In this paper the class of ARCH($\infty$) models is generalized to
the nonstationary class of ARCH($\infty$) models with time-varying
coefficients. For fixed time points, a stationary approximation is given
leading to the notation "locally stationary ARCH($\infty$) process." The
asymptotic properties of weighted quasi-likelihood estimators of
time-varying ARCH($p$) processes ($p < \infty$) are studied, including
asymptotic normality. In particular, the extra bias due to nonstationarity
of the process is investigated. Moreover, a Taylor expansion of the
nonstationary ARCH process in terms of stationary processes is given
and it is proved that the time-varying ARCH process can be written
as a time-varying Volterra series.

1. Introduction. To model volatility in time series, Engle [6] introduced 
the ARCH model where the conditional variance is stochastic and depen- 
dent on past observations. The ARCH model and several of its related 
models have gained widespread recognition because they model quite well 
the volatility in financial markets over relatively short periods of time (cf. 
[3, 13]). However, underlying all these models is the assumption of stationar- 
ity. Now given the changing pace of the world's economy, modeling financial 
returns over long intervals using stationary time series models may be in- 
appropriate. It is quite plausible that structural changes in financial time 
series may occur, causing the time series over long intervals to deviate sig- 
nificantly from stationarity. It is therefore plausible that, by relaxing the 
assumption of stationarity in an adequate way, we may obtain a better fit. 
In this direction, Drees and Starica [5] have proposed the simple nonlinear
model $X_t = \mu + \sigma(t) Z_t$, where $Z_t$ are independent, identically distributed
random variables and $\sigma(\cdot)$ is a smooth function, which they estimate using
a nonparametric regression method. Essentially, though it is not mentioned,
the authors are treating $\sigma(t)$ as if it were of the form $\sigma(t) = \sigma(t/N)$, with $N$
being the sample size. Through this rescaling device it is possible to obtain 
a framework for a meaningful asymptotic theory. Feng [7] has also studied 
time inhomogeneous stochastic volatility, by introducing a multiplicative 
seasonal and trend component into the GARCH model. 

In this paper we generalize the class of ARCH($\infty$) models (cf. [9, 16]) to
models with time-varying parameters:

$$X_t = \sigma(t) Z_t \quad\text{where}\quad \sigma(t)^2 = a_0(t) + \sum_{j=1}^{\infty} a_j(t) X_{t-j}^2, \tag{1}$$

and $Z_t$ are independent, identically distributed random variables with
$\mathbb{E} Z_t = 0$, $\mathbb{E} Z_t^2 = 1$. As in nonparametric regression and in other work on
nonparametric statistics, we use the rescaling device to develop an asymptotic
theory around such a class of models, that is, we rescale the parameters to the
unit interval [see (2) below]. The resulting process is called the time-varying
ARCH (tvARCH) process. The same rescaling device has been used, for
example, in nonparametric time series by Robinson [15] and by Dahlhaus 
[4] in his definition of local stationarity which was essentially restricted to 
time-varying linear processes. We shall show in Section 2 that the tvARCH 
process can be locally approximated by stationary ARCH processes. There- 
fore, this new class of tvARCH processes can also be called locally stationary. 
The stationary ARCH approximation will later be used to transfer results 
for stationary ARCH processes to the locally stationary situation. 

In Section 3 we study parameter estimation for tvARCH($p$) models by
weighted quasi-maximum likelihood methods. The nonstationarity of the 
process causes the estimator to be biased. We will show that the bias can be 
explained in terms of the derivatives of the tvARCH process. Furthermore, 
we will prove asymptotic normality of the estimator. 

In Section 4 we also define a special derivative of the tvARCH process 
and give a Taylor expansion of the nonstationary tvARCH process in terms 
of stationary processes. This derivative enables us to study more precisely 
the nonstationary behavior of the process. Moreover, the derivative process 
turns out to be a solution of a stochastic differential equation. 

In Section 5 time-varying Volterra series are studied. They are used to 
prove the existence of a tvARCH($\infty$) process and to derive the results of
Section 4 on its derivatives. It is worth noting that the results in Section 5 
are of independent interest and the methods used here can be generalized 
to other nonstationary processes. 

In the Appendix we prove convergence theorems for ergodic stationary 
processes and some specific convergence and approximation results for the 
likelihood process. We also derive mixing properties of several processes, 
including derivatives of the likelihood process. 
2. The time-varying ARCH process. In this section we broaden the class
of ARCH($\infty$) models, by introducing nonstationary ARCH($\infty$) models with
time-dependent parameters. In order to obtain a framework for a meaningful
asymptotic theory, we rescale the parameter functions as in nonparametric
regression and for (linear) locally stationary processes to the unit interval,
that is, we assume

$$X_{t,N} = \sigma_{t,N} Z_t \quad\text{where}\quad \sigma_{t,N}^2 = a_0\Big(\frac{t}{N}\Big) + \sum_{j=1}^{\infty} a_j\Big(\frac{t}{N}\Big) X_{t-j,N}^2 \quad\text{for } t = 1,\ldots,N, \tag{2}$$

where $Z_t$ are independent, identically distributed random variables with
$\mathbb{E} Z_t = 0$, $\mathbb{E} Z_t^2 = 1$. We call the sequence of stochastic processes
$\{X_{t,N} : t = 1,\ldots,N\}$ which satisfy (2) a time-varying ARCH (tvARCH) process. As
shown below, the tvARCH process can be locally approximated by stationary
ARCH processes. Therefore, we also call tvARCH processes locally
stationary.

We mention that the rescaling technique is mainly introduced for ob- 
taining a meaningful asymptotic theory, and by this device we can obtain 
adequate approximations for the nonrescaled case. In particular, the rescaling
does not affect the estimation procedure. Furthermore, classical ARCH
models are included as a special case (if the parameters are constant in 
time). 
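To make recursion (2) concrete, the following is a minimal sketch that simulates a tvARCH($p$) path. The function name `simulate_tvarch`, the Gaussian innovations and the convention of setting pre-sample squares to zero are assumptions made for illustration only; the text does not prescribe them.

```python
import random

def simulate_tvarch(n, coef_funcs, seed=0):
    """Simulate X_{t,N} for t = 1..n from recursion (2) with finite order p.

    coef_funcs = [a0, a1, ..., ap]: functions of rescaled time u = t/n.
    Assumptions: Gaussian Z_t, and pre-sample squares X_{t,N}^2 (t <= 0) set to 0.
    Returns (x, sigma2) as lists indexed by t - 1.
    """
    rng = random.Random(seed)
    p = len(coef_funcs) - 1
    past_sq = [0.0] * p                      # past_sq[j-1] holds X_{t-j,N}^2
    x, sigma2 = [], []
    for t in range(1, n + 1):
        u = t / n                            # rescaled time in (0, 1]
        s2 = coef_funcs[0](u) + sum(
            coef_funcs[j](u) * past_sq[j - 1] for j in range(1, p + 1))
        xt = (s2 ** 0.5) * rng.gauss(0.0, 1.0)
        x.append(xt)
        sigma2.append(s2)
        past_sq = ([xt * xt] + past_sq)[:p]  # shift the lag window
    return x, sigma2

# Example: a tvARCH(1) path whose coefficients drift slowly over the sample.
x, sigma2 = simulate_tvarch(
    1000, [lambda u: 1.0 + 0.5 * u, lambda u: 0.2 + 0.3 * u], seed=42)
```

With constant coefficient functions the same code produces a classical ARCH($p$) path, which is the special case mentioned above.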

We make the following assumptions. 



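Assumption 1. The sequence of stochastic processes $\{X_{t,N} : t = 1,\ldots,N\}$
has a time-varying ARCH representation defined in (2) where the parameters
satisfy the following properties: There exist constants $0 < \rho, Q, M < \infty$,
$0 < \nu < 1$ and a positive sequence $\{\ell(j)\}$ such that $\inf_u a_0(u) > \rho$ and

$$\sup_u a_j(u) \le \frac{Q}{\ell(j)}, \tag{3}$$

$$Q \sum_{j=1}^{\infty} \frac{1}{\ell(j)} \le 1 - \nu, \tag{4}$$

$$|a_j(u) - a_j(v)| \le M \frac{|u - v|}{\ell(j)}, \tag{5}$$

where $\{\ell(j)\}$ satisfies $\sum_{j \ge 1} j/\ell(j) < \infty$.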
An example of such a positive sequence $\{\ell(j)\}$ is

with some $\kappa > 0$ or $\ell(j) = \eta^j$ for some $\eta > 1$. Condition (4) implies that
$\mathbb{E}(X_{t,N}^2)$ is uniformly bounded over $t$ and $N$.

Proposition 1. Under Assumption 1, $\{X_{t,N}^2\}$ defined in (2) has an
almost surely well-defined unique solution in the set of all causal solutions.
The solution has the form of a time-varying Volterra series expansion.

The proof of Proposition 1, as well as all the other proofs of results in
this section, can be found in Section 5. We mention that a similar result also
holds for nonrescaled tvARCH($\infty$) processes.

It is worth noting that throughout this paper we shall be working with
$X_{t,N}^2$ rather than $X_{t,N}$, unless stated otherwise. This is because $X_{t,N}$ can
be randomly either positive or negative, whereas $X_{t,N}^2$ is always positive,
allowing it to be unique.

The smoothness of the parameters $\{a_j(\cdot)\}$ guarantees that the process
has (asymptotically) locally a stationary behavior. We now make this notion
precise. The first point of interest is to study the stationary process
which locally approximates the tvARCH process in some neighborhood of
a fixed point $t_0$ (or in rescaled time $u_0$). For each given $u_0 \in (0,1]$, the
stochastic process $\{X_t(u_0)\}$ is the stationary ARCH process associated with
the tvARCH($\infty$) process at time point $u_0$ if it satisfies

$$X_t(u_0) = \sigma_t(u_0) Z_t \quad\text{where}\quad \sigma_t(u_0)^2 = a_0(u_0) + \sum_{j=1}^{\infty} a_j(u_0) X_{t-j}(u_0)^2 \tag{6}$$

for all $t \in \mathbb{Z}$. It is worth noting, if the parameters $\{a_j(u_0)\}$ satisfy
Assumption 1, then $\{X_t(u_0)\}$ is a stationary, ergodic ARCH($\infty$) process (cf. [9]).
Comparing (6) with (2), it seems clear that if $t/N$ is close to $u_0$, then $X_{t,N}^2$
and $X_t(u_0)^2$ should be close and the degree of the approximation should
depend both on the rescaling factor $N$ and the deviation $|t/N - u_0|$. This is
shown below.

Theorem 1. Suppose $\{X_{t,N}\}$ is a tvARCH process which satisfies
Assumption 1 and let $X_t(u_0)$ be defined as in (6). Then there exist a stationary,
ergodic, positive process $\{U_t\}$ independent of $u_0$ with finite mean and a
constant $K$ independent of $t$ and $N$ such that

$$|X_{t,N}^2 - X_t(u_0)^2| \le K\Big(\Big|\frac{t}{N} - u_0\Big| + \frac{1}{N}\Big) U_t \quad\text{a.s.} \tag{7}$$
We mention that an explicit formula for $U_t$ is given in (48). As a
consequence of (7), we have

$$X_{t,N}^2 = X_t(u_0)^2 + O_p\Big(\Big|\frac{t}{N} - u_0\Big| + \frac{1}{N}\Big).$$

The bound in (7) allows us to approximate the local average of $X_{t,N}^2$ by an
average of $X_t(u_0)^2$ (this is of particular interest here, since the local average
and weighted local average will be used frequently in later sections). For
example, suppose $|u_0 - t_0/N| < 1/N$ and we average $X_{t,N}^2$ about a neighborhood
whose length $(2M + 1)$ increases as $N$ increases but where the ratio
$M/N \to 0$ as $N \to \infty$. Then by using (7), we have

$$\frac{1}{2M+1} \sum_{k=-M}^{M} X_{t_0+k,N}^2 = \frac{1}{2M+1} \sum_{k=-M}^{M} X_{t_0+k}(u_0)^2 + R_N, \tag{8}$$

where $R_N$ is bounded by

$$|R_N| \le \frac{K}{2M+1} \sum_{k=-M}^{M} \Big(\Big|\frac{t_0+k}{N} - u_0\Big| + \frac{1}{N}\Big) U_{t_0+k} \le K\,\frac{M+2}{N} \cdot \frac{1}{2M+1} \sum_{k=-M}^{M} U_{t_0+k} \xrightarrow{P} 0.$$

Thus, about the time point $t_0$ the local average of a tvARCH process is
asymptotically the same as the local average of the stationary ARCH process
$\{X_t(u_0)^2\}$. Therefore, by using (7), we can locally approximate the tvARCH
process by a stationary process. The above approximation can be refined by
using derivative processes as defined in Section 4. By using them, we can
find, for example, an expression for the asymptotic bias in (8).
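The local approximation can be explored numerically. The sketch below drives a tvARCH(1) recursion and its stationary counterpart at $u_0$ with the same innovations and compares local averages of the squares around $t_0$. The function names, the Gaussian innovations, the burn-in length and the choice to freeze the coefficients at $u = 0$ before the sample starts are all assumptions of this illustration, not part of the theory above.

```python
import random

def squared_paths(n, u0, a0, a1, burn=500, seed=1):
    """Return (x2_tv, x2_st): X_{t,N}^2 and X_t(u0)^2 for t = 1..n, same Z_t."""
    rng = random.Random(seed)
    tv = st = 0.0
    x2_tv, x2_st = [], []
    for t in range(-burn + 1, n + 1):
        z2 = rng.gauss(0.0, 1.0) ** 2
        u = max(t, 0) / n                  # assumption: coefficients frozen at u = 0 for t <= 0
        tv = (a0(u) + a1(u) * tv) * z2     # time-varying recursion
        st = (a0(u0) + a1(u0) * st) * z2   # stationary recursion at u0
        if t >= 1:
            x2_tv.append(tv)
            x2_st.append(st)
    return x2_tv, x2_st

def local_average(x2, t0, m):
    """Average of the squares over times t0 - m, ..., t0 + m (t0 is 1-based)."""
    window = x2[t0 - 1 - m: t0 + m]
    return sum(window) / len(window)

# Example: compare the two local averages around u0 = 0.5.
n, u0, m = 4000, 0.5, 100
tv, st = squared_paths(n, u0, lambda u: 1.0 + u, lambda u: 0.2 + 0.3 * u)
gap = local_average(tv, int(u0 * n), m) - local_average(st, int(u0 * n), m)
```

The quantity `gap` plays the role of the remainder in the local-average comparison; with constant coefficient functions the two paths coincide and the gap is exactly zero.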

3. The segment quasi-likelihood estimate. In this section we consider a
kernel type estimator of the parameters of a tvARCH($p$) model given the
sample $\{X_{t,N} : t = 1,\ldots,N\}$. The process $\{X_{t,N}\}$ is assumed to satisfy the
representation

$$X_{t,N} = \sigma_{t,N} Z_t \quad\text{where}\quad \sigma_{t,N}^2 = a_0\Big(\frac{t}{N}\Big) + \sum_{j=1}^{p} a_j\Big(\frac{t}{N}\Big) X_{t-j,N}^2 \quad\text{for } t = 1,\ldots,N, \tag{9}$$

where $Z_t$ are independent, identically distributed random variables with
$\mathbb{E} Z_t = 0$, $\mathbb{E} Z_t^2 = 1$. The order $p$ is assumed known. We study the
distributional properties of the estimator, including asymptotic normality.
Furthermore, we will investigate the bias of the estimator due to nonstationarity of
the tvARCH($p$) process. We will use the following assumptions.



Assumption 2. The sequence of stochastic processes $\{X_{t,N} : t = 1,\ldots,N\}$
has a tvARCH($p$) representation defined by (9). Furthermore:

(i) The process satisfies Assumption 1.

(ii) For some $\delta > 0$,

$$\mathbb{E}(|Z_t|^{4(1+\delta)}) < \infty. \tag{10}$$

(iii) Let $\Omega$ be the compact set

$$\Omega = \Big\{\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_p) : \sum_{j=1}^{p} \alpha_j \le 1,\ \rho_1 \le \alpha_0 \le \rho_2,\ \rho_1 \le \alpha_i \text{ for } i = 1,\ldots,p\Big\},$$

where $0 < \rho_1 \le \rho_2 < \infty$. For each $u \in (0, 1]$, we assume $a_u \in \operatorname{Int}(\Omega)$, where
$a_u = (a_0(u), a_1(u), \ldots, a_p(u))$.

(iv) The third derivative of $a_j(\cdot)$ exists with

$$\sup_u \Big|\frac{\partial^i a_j(u)}{\partial u^i}\Big| \le C$$

for $i = 1, 2, 3$ and $j = 0, 1, \ldots, p$, where $C$ is a finite constant independent of
$i$ and $j$.

(v) The random variable Zt has a positive density on an interval con- 
taining zero. 

(vi) [This assumption is only used in Theorem 3(ii)] 



Remark 1. (i) The conditions placed on the parameter space in
Assumption 2(iii) can be relaxed to include all vectors $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_p)$,
where $\alpha_i = 0$ for any $i = 1,\ldots,p$, in the parameter space. By including these
points, a method for model selection could be derived. However, the cost for
relaxing this assumption is that additional moment conditions have to be
placed on $X_{t,N}$.

(ii) We use Assumption 2(ii) to prove asymptotic normality of the
estimator. Typically for stationary ARCH processes, the result can be proved if
$\mathbb{E}(Z_t^4) < \infty$. However, we require the mildly stronger assumption
$\mathbb{E}(|Z_t|^{4(1+\delta)}) < \infty$ to prove a similar result for sums of martingale arrays as
opposed to sums of martingale differences used in the stationary situation (cf. [10], Theorem
3.2). Assumption 2(vi) means that both $\mathbb{E}(X_{t,N}^{12})$ and $\mathbb{E}(X_t(u)^{12})$ are
uniformly bounded in $t$, $N$ and $u$. We refer also to the comments on the moment
assumptions in Section 6.

(iii) In Section 5 we apply a theorem of Basrak, Davis and Mikosch [1],
who gave conditions under which a GARCH($p$, $q$) process is mixing.
Assumption 2(iii) is sufficient for the Lyapunov exponent of the random recurrence
matrix associated with $\{X_t(u)\}$ to be negative. In addition, Assumption
2(iii), (v) is sufficient to ensure that the ARCH process $\{X_t(u)\}$ is $\alpha$-mixing
of rate $-\infty$ (see [1] and references therein).
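We now define the segment (kernel) estimator of $a(u_0)$ for each $u_0 \in (0, 1)$.
Let $t_0 \in \mathbb{N}$ be such that $|u_0 - t_0/N| < 1/N$. The estimator considered in this
section is the minimizer of the weighted conditional likelihood

$$\mathcal{L}_{t_0,N}(\alpha) := \sum_{k=p+1}^{N} \frac{1}{bN} W\Big(\frac{t_0 - k}{bN}\Big) \ell_{k,N}(\alpha), \tag{11}$$

where

$$\ell_{k,N}(\alpha) = \frac{1}{2}\Big(\log w_{k,N}(\alpha) + \frac{X_{k,N}^2}{w_{k,N}(\alpha)}\Big) \quad\text{with}\quad w_{k,N}(\alpha) = \alpha_0 + \sum_{j=1}^{p} \alpha_j X_{k-j,N}^2 \tag{12}$$

and $W : [-1/2, 1/2] \to \mathbb{R}$ is a kernel function of bounded variation with
$\int_{-1/2}^{1/2} W(x)\,dx = 1$ and $\int_{-1/2}^{1/2} x W(x)\,dx = 0$. That is, we consider

$$\hat{a}_{t_0,N} = \arg\min_{\alpha \in \Omega} \mathcal{L}_{t_0,N}(\alpha). \tag{13}$$

Obviously $\ell_{t,N}(\alpha)$ is the conditional likelihood of $X_{t,N}$ given $X_{t-1,N}, \ldots,
X_{t-p,N}$ and the parameters $\alpha = (\alpha_0, \ldots, \alpha_p)^T$, provided the $Z_t$ are normally
distributed. All results below also hold if the $Z_t$ are not normally distributed
but simply satisfy Assumption 2. For this reason (and the fact that the
conditional likelihood is not the full likelihood), the likelihood is called a
quasi-likelihood. For later reference, we list the derivatives of $\ell_{k,N}(\alpha)$. Let
$\nabla = (\partial/\partial\alpha_0, \ldots, \partial/\partial\alpha_p)^T$. Since $\nabla^2 w_{k,N}(\alpha) = 0$, we have

$$\ell_{k,N}(\alpha) = \frac{1}{2}\Big(\log(w_{k,N}(\alpha)) + \frac{X_{k,N}^2}{w_{k,N}(\alpha)}\Big), \tag{14}$$

$$\nabla \ell_{k,N}(\alpha) = \frac{1}{2}\Big(\frac{\nabla w_{k,N}(\alpha)}{w_{k,N}(\alpha)} - \frac{X_{k,N}^2 \nabla w_{k,N}(\alpha)}{w_{k,N}(\alpha)^2}\Big), \tag{15}$$

$$\nabla^2 \ell_{k,N}(\alpha) = \frac{1}{2}\Big(-\frac{\nabla w_{k,N}(\alpha) \nabla w_{k,N}(\alpha)^T}{w_{k,N}(\alpha)^2} + 2\,\frac{X_{k,N}^2 \nabla w_{k,N}(\alpha) \nabla w_{k,N}(\alpha)^T}{w_{k,N}(\alpha)^3}\Big). \tag{16}$$

$\hat{a}_{t_0,N}$ is regarded as an estimator of $a_{t_0/N} = (a_0(t_0/N), \ldots, a_p(t_0/N))^T$ or of
$a_{u_0}$, where $|u_0 - t_0/N| < 1/N$.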
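The weighted quasi-likelihood and its gradient translate directly into code. The following is a minimal sketch of the summand $\ell_{k,N}(\alpha)$, its gradient as in (15), and the kernel-weighted sum as in (11). The function names, the 0-based storage of the squared observations and the particular kernel $W(x) = 6(1/4 - x^2)$ on $[-1/2, 1/2]$ are illustrative assumptions; the text only requires a kernel of bounded variation with unit integral and vanishing first moment.

```python
import math

def w(alpha, x2, k):
    """w_{k,N}(alpha) = alpha_0 + sum_{j=1}^p alpha_j X_{k-j,N}^2.

    x2[i] stores X_{i+1,N}^2, so time k (1-based) sits at index k - 1.
    """
    return alpha[0] + sum(alpha[j] * x2[k - 1 - j] for j in range(1, len(alpha)))

def ell(alpha, x2, k):
    """Quasi-likelihood summand (1/2) (log w + X_k^2 / w)."""
    wk = w(alpha, x2, k)
    return 0.5 * (math.log(wk) + x2[k - 1] / wk)

def grad_ell(alpha, x2, k):
    """Gradient of ell with respect to alpha; grad w = (1, X_{k-1}^2, ..., X_{k-p}^2)."""
    wk = w(alpha, x2, k)
    grad_w = [1.0] + [x2[k - 1 - j] for j in range(1, len(alpha))]
    scale = 0.5 * (1.0 / wk - x2[k - 1] / wk ** 2)
    return [scale * g for g in grad_w]

def epanechnikov(x):
    """Illustrative kernel on [-1/2, 1/2] with integral 1 and zero first moment."""
    return 6.0 * (0.25 - x * x) if abs(x) <= 0.5 else 0.0

def weighted_ql(alpha, x2, t0, b, kernel=epanechnikov):
    """Kernel-weighted sum of ell over k = p + 1, ..., N around t0 with bandwidth b."""
    n, p = len(x2), len(alpha) - 1
    bn = b * n
    return sum(kernel((t0 - k) / bn) / bn * ell(alpha, x2, k)
               for k in range(p + 1, n + 1))
```

In practice one would pass `weighted_ql` (and, if desired, the matching weighted sum of `grad_ell`) to a constrained numerical optimizer over the parameter set; that step is omitted here.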

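In the derivation of the asymptotic properties of this estimator we make
use of the local approximation of $X_{t,N}^2$ by the stationary process $X_t(u_0)^2$
defined in Section 2. Similarly to the above, we therefore define the weighted
likelihood

$$\mathcal{L}_N(u_0, \alpha) := \sum_{k=p+1}^{N} \frac{1}{bN} W\Big(\frac{t_0 - k}{bN}\Big) \ell_k(u_0, \alpha), \tag{17}$$

where $|u_0 - t_0/N| < 1/N$ and

$$\ell_t(u_0, \alpha) = \frac{1}{2}\Big(\log w_t(u_0, \alpha) + \frac{X_t(u_0)^2}{w_t(u_0, \alpha)}\Big) \quad\text{with}\quad w_t(u_0, \alpha) = \alpha_0 + \sum_{j=1}^{p} \alpha_j X_{t-j}(u_0)^2. \tag{18}$$

It is obvious that the same formulas as in (14)-(16) also hold for $\ell_k(u_0, \alpha)$
with $X_{k,N}^2$ and $w_{k,N}(\alpha)$ replaced by $X_k(u_0)^2$ and $w_k(u_0, \alpha)$, respectively.
It is shown below that both $\mathcal{L}_{t_0,N}(\alpha)$ and $\mathcal{L}_N(u_0, \alpha)$ converge to

$$\mathcal{L}(u_0, \alpha) := \mathbb{E}(\ell_0(u_0, \alpha)) \tag{19}$$

as $N \to \infty$, $b \to 0$, $bN \to \infty$ and $|u_0 - t_0/N| < 1/N$. It is easy to show that
$\mathcal{L}(u_0, \alpha)$ is minimized by $\alpha = a_{u_0}$.
Furthermore, let

$$B_{t_0,N}(\alpha) := \mathcal{L}_{t_0,N}(\alpha) - \mathcal{L}_N(u_0, \alpha) = \sum_{k=p+1}^{N} \frac{1}{bN} W\Big(\frac{t_0 - k}{bN}\Big) \big(\ell_{k,N}(\alpha) - \ell_k(u_0, \alpha)\big). \tag{20}$$

Since $\mathcal{L}_N(u_0, \alpha)$ is the likelihood of the stationary approximation
$X_t(u_0)$, $B_{t_0,N}(\alpha)$ is a bias caused by the deviation from stationarity. Lemma
A.6 implies that $B_{t_0,N}(\alpha) = O_p(b)$. A better rate will be derived by a Taylor
expansion in Proposition 3. Let

$$\Sigma(u_0) = \frac{1}{2}\,\mathbb{E}\Big[\frac{\nabla w_0(u_0, a_{u_0}) \nabla w_0(u_0, a_{u_0})^T}{w_0(u_0, a_{u_0})^2}\Big]. \tag{21}$$

Since $X_k(u_0)^2 / w_k(u_0, a_{u_0}) = Z_k^2$ and $Z_k^2$ is independent of $w_k(u_0, a_{u_0})$, we have

$$\mathbb{E}(\nabla^2 \ell_0(u_0, a_{u_0})) = -\Sigma(u_0)$$

and

$$\mathbb{E}\big(\nabla \ell_0(u_0, a_{u_0}) \nabla \ell_0(u_0, a_{u_0})^T\big) = \frac{\operatorname{var}(Z_0^2)}{2}\,\Sigma(u_0).$$

If $Z_t$ is Gaussian, then $\operatorname{var}(Z_0^2) = 2$.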
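Lemma 1. Suppose $\{X_{t,N} : t = 1,\ldots,N\}$ is a tvARCH($p$) process which
satisfies Assumption 2(i), (iii) and let $\mathcal{L}_N(u_0, \alpha)$, $\mathcal{L}(u_0, \alpha)$ and $B_{t_0,N}$ be as
defined by (17), (19) and (20), respectively. Then

$$\sup_{\alpha \in \Omega} |\mathcal{L}_N(u_0, \alpha) - \mathcal{L}(u_0, \alpha)| \xrightarrow{P} 0, \tag{22}$$

$$\sup_{\alpha \in \Omega} |B_{t_0,N}(\alpha)| \xrightarrow{P} 0, \tag{23}$$

$$\sup_{\alpha \in \Omega} |\nabla^2 \mathcal{L}_N(u_0, \alpha) - \nabla^2 \mathcal{L}(u_0, \alpha)| \xrightarrow{P} 0 \tag{24}$$

and

$$\sup_{\alpha \in \Omega} |\nabla^2 B_{t_0,N}(\alpha)| \xrightarrow{P} 0, \tag{25}$$

where $b \to 0$ and $bN \to \infty$ as $N \to \infty$.

A direct implication of the lemma above is the following corollary.

Corollary 1. Let $\mathcal{L}_{t_0,N}(\alpha)$ be as defined in (11). Then under the
assumptions in Lemma 1, we have

$$\sup_{\alpha \in \Omega} |\mathcal{L}_{t_0,N}(\alpha) - \mathcal{L}(u_0, \alpha)| \xrightarrow{P} 0 \tag{26}$$

and

$$\sup_{\alpha \in \Omega} |\nabla^2 \mathcal{L}_{t_0,N}(\alpha) - \nabla^2 \mathcal{L}(u_0, \alpha)| \xrightarrow{P} 0, \tag{27}$$

where $b \to 0$, $bN \to \infty$ as $N \to \infty$.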

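In the theorem below we show that $\hat{a}_{t_0,N}$ is a consistent estimator of $a_{u_0}$.

Theorem 2. Suppose $\{X_{t,N} : t = 1,\ldots,N\}$ is a tvARCH($p$) process which
satisfies Assumption 2(i), (iii) and the estimator $\hat{a}_{t_0,N}$ is as defined in (13).
Then if $|u_0 - t_0/N| < 1/N$, we have

$$\hat{a}_{t_0,N} \xrightarrow{P} a_{u_0},$$

where $b \to 0$ and $bN \to \infty$ as $N \to \infty$.

Proof. By using (26), we have pointwise convergence $\mathcal{L}_{t_0,N}(\alpha) \xrightarrow{P} \mathcal{L}(u_0, \alpha)$.
Since $\hat{a}_{t_0,N} = \arg\min_{\alpha} \mathcal{L}_{t_0,N}(\alpha)$ and $a_{u_0} = \arg\min_{\alpha} \mathcal{L}(u_0, \alpha)$, we have

$$\mathcal{L}_{t_0,N}(\hat{a}_{t_0,N}) \le \mathcal{L}_{t_0,N}(a_{u_0}) \xrightarrow{P} \mathcal{L}(u_0, a_{u_0}) \le \mathcal{L}(u_0, \hat{a}_{t_0,N}).$$

With (26), we now obtain $\mathcal{L}_{t_0,N}(\hat{a}_{t_0,N}) \xrightarrow{P} \mathcal{L}(u_0, a_{u_0})$, and hence also
$\mathcal{L}(u_0, \hat{a}_{t_0,N}) \xrightarrow{P} \mathcal{L}(u_0, a_{u_0})$. From the continuity of
$\mathcal{L}(u_0, \cdot)$ and the compactness of the parameter space, we can now conclude
$\hat{a}_{t_0,N} \xrightarrow{P} a_{u_0}$, provided $\mathcal{L}(u_0, \cdot)$ has a unique minimum. Since $\mathcal{L}(u_0, \cdot)$ is the
same function as in the stationary case, this follows from Lemma 5.5 of [2].
□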

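We now prove asymptotic normality of the estimator with the usual Taylor
expansion argument. We have

$$\nabla \mathcal{L}_{t_0,N}(\hat{a}_{t_0,N})_i - \nabla \mathcal{L}_{t_0,N}(a_{u_0})_i = \big\{\nabla^2 \mathcal{L}_{t_0,N}(\bar{a}^{(i)}_{t_0,N})(\hat{a}_{t_0,N} - a_{u_0})\big\}_i, \tag{28}$$

with $\bar{a}^{(i)}_{t_0,N}$ between $\hat{a}_{t_0,N}$ and $a_{u_0}$. Since $a_{u_0}$ is in the interior of $\Omega$, we
have $\sqrt{bN}\,\nabla \mathcal{L}_{t_0,N}(\hat{a}_{t_0,N})_i \xrightarrow{P} 0$. Since $\bar{a}^{(i)}_{t_0,N} \xrightarrow{P} a_{u_0}$ and
$\sup_{\alpha \in \Omega} |\nabla^2 \mathcal{L}_{t_0,N}(\alpha) - \nabla^2 \mathcal{L}(u_0, \alpha)| \xrightarrow{P} 0$ [see (27)], then
$\nabla^2 \mathcal{L}_{t_0,N}(\bar{a}^{(i)}_{t_0,N}) \xrightarrow{P} -\Sigma(u_0)$. Note that $\Sigma(u_0)$
is nonsingular. This follows from Lemma 5.7 of [2] since $\Sigma(u_0)$ is the same
as in the stationary case.

Therefore, the distributional properties of $\hat{a}_{t_0,N} - a_{u_0}$ are determined by
$\nabla \mathcal{L}_{t_0,N}(a_{u_0})$. By using (20), we see that

$$\nabla \mathcal{L}_{t_0,N}(a_{u_0}) = \nabla \mathcal{L}_N(u_0, a_{u_0}) + \nabla B_{t_0,N}(a_{u_0}), \tag{29}$$

which is essentially a decomposition into a stochastic and a bias part [although
$\nabla B_{t_0,N}(a_{u_0})$ is also random, its variance is of a lower order; see
the details below]. The bias measures the deviation from stationarity and
will disappear for a suitable choice of bandwidth $b$ (see Proposition 2 and
Theorem 3 below). By substituting (29) into (28), we have

$$\sqrt{bN}\big\{(\hat{a}_{t_0,N} - a_{u_0}) + \Sigma(u_0)^{-1} \nabla B_{t_0,N}(a_{u_0})\big\} = \sqrt{bN}\,\Sigma(u_0)^{-1} \nabla \mathcal{L}_N(u_0, a_{u_0}) + o_p(1).$$

Thus, the asymptotic distribution of $(\hat{a}_{t_0,N} - a_{u_0})$ is determined by $\nabla \mathcal{L}_N(u_0, a_{u_0})$.
Note that this is the gradient of the likelihood of the stationary process
$X_t(u_0)$, however, with kernel weights. Since

$$\nabla \ell_k(u_0, a_{u_0}) = \frac{1}{2}(1 - Z_k^2)\,\frac{\nabla w_k(u_0, a_{u_0})}{w_k(u_0, a_{u_0})}$$

is a martingale difference, $\nabla \mathcal{L}_N(u_0, a_{u_0})$ is the weighted sum of martingale
differences.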

Proposition 2. Suppose $\{X_{t,N} : t = 1,\ldots,N\}$ is a tvARCH($p$) process
which satisfies Assumption 2(i), (ii), (iii) and $\mathcal{L}_N(u_0, a_{u_0})$ is as defined in
(17). Then if $|u_0 - t_0/N| < 1/N$ we have

$$\sqrt{bN}\,\nabla \mathcal{L}_N(u_0, a_{u_0}) \xrightarrow{D} \mathcal{N}\Big(0,\ w_2\,\frac{\operatorname{var}(Z_0^2)}{2}\,\Sigma(u_0)\Big), \tag{31}$$

where $b \to 0$, $bN \to \infty$ as $N \to \infty$ and $w_2 = \int_{-1/2}^{1/2} W(x)^2\,dx$.

Proof. Since $\nabla \mathcal{L}_N(u_0, a_{u_0})$ is the weighted sum of martingale
differences, the result follows from the martingale central limit theorem and the
Cramer-Wold device. It is straightforward to check the conditional Lindeberg
condition and the conditional variance condition. We omit the details. □
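The kernel constants entering the asymptotic variance and, later, the bias can be evaluated numerically for any concrete kernel. The sketch below uses a midpoint rule on $[-1/2, 1/2]$; the kernel $W(x) = 6(1/4 - x^2)$ and the function names are illustrative assumptions, since no particular kernel is fixed in the text.

```python
def epanechnikov(x):
    """Illustrative kernel on [-1/2, 1/2]."""
    return 6.0 * (0.25 - x * x) if abs(x) <= 0.5 else 0.0

def kernel_constants(kernel, grid=20000):
    """Midpoint-rule approximations over [-1/2, 1/2] of
    (integral of W, integral of x W, integral of W^2, integral of x^2 W)."""
    h = 1.0 / grid
    total = first = w_sq = w_second = 0.0
    for i in range(grid):
        x = -0.5 + (i + 0.5) * h
        wx = kernel(x)
        total += wx * h          # should be 1
        first += x * wx * h      # should be 0
        w_sq += wx * wx * h      # the constant w_2 in the limiting variance
        w_second += x * x * wx * h  # second moment of the kernel
    return total, first, w_sq, w_second

total, first, w_sq, w_second = kernel_constants(epanechnikov)
# For this kernel the exact values are 1, 0, 6/5 and 1/20.
```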



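We now consider the stochastic bias $\nabla B_{t_0,N}(a_{u_0})$. By using (30) and
Lemma A.6, we immediately get the relation

$$\nabla B_{t_0,N}(a_{u_0}) = O_p(b). \tag{32}$$

This bound together with Proposition 2 leads to the assertion of
Theorem 3(i) below. As mentioned above, the stochastic bias is a measure for
the deviation of the process $\{\ell_{t,N}(a_{u_0})\}$ from stationarity. This deviation
depends on the rate of change of the parameters $\{a_j(u)\}$. Under stronger
moment conditions, we will now determine this bias. To achieve this, we
replace $\nabla \ell_{k,N}(a_{u_0})$ by $\nabla \ell_k(k/N, a_{u_0})$:

$$\nabla B_{t_0,N}(a_{u_0}) = \sum_{k=p+1}^{N} \frac{1}{bN} W\Big(\frac{t_0 - k}{bN}\Big) \Big\{\nabla \ell_k\Big(\frac{k}{N}, a_{u_0}\Big) - \nabla \ell_k(u_0, a_{u_0})\Big\} + R_N, \tag{33}$$

where

$$R_N = \sum_{k=p+1}^{N} \frac{1}{bN} W\Big(\frac{t_0 - k}{bN}\Big) \Big\{\nabla \ell_{k,N}(a_{u_0}) - \nabla \ell_k\Big(\frac{k}{N}, a_{u_0}\Big)\Big\}.$$

Corollary A.1 now implies

$$\Big|\nabla \ell_{k,N}(a_{u_0}) - \nabla \ell_k\Big(\frac{k}{N}, a_{u_0}\Big)\Big| \le \frac{K}{N}\Big(U_k + (1 + Z_k^2) \sum_{j=1}^{p} U_{k-j}\Big),$$

with some constant $K$ uniformly in $k$. Lemma 1 together with the
independence of $Z_k^2$ and $U_{k-j}$ now imply

$$R_N = O_p\Big(\frac{1}{N}\Big).$$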
Suppose for each $j = 0,\ldots,p$ the third derivative of $a_j(\cdot)$ exists and is
uniformly bounded. Then by using Corollary 3 and taking a Taylor expansion
of $\nabla \ell_k(u, a_{u_0})$ about $u = u_0$, we have

$$\nabla \ell_k\Big(\frac{k}{N}, a_{u_0}\Big) - \nabla \ell_k(u_0, a_{u_0}) = \Big(\frac{k}{N} - u_0\Big) \frac{\partial \nabla \ell_k(u, a_{u_0})}{\partial u}\Big|_{u=u_0} + \frac{(k/N - u_0)^2}{2}\,\frac{\partial^2 \nabla \ell_k(u, a_{u_0})}{\partial u^2}\Big|_{u=u_0} + \frac{(k/N - u_0)^3}{3!}\,\frac{\partial^3 \nabla \ell_k(u, a_{u_0})}{\partial u^3}\Big|_{u=U_k}, \tag{34}$$

where the random variable $U_k \in (0, 1]$. A detailed investigation of the
different terms now leads to the following result on $\nabla B_{t_0,N}(a_{u_0})$. We mention
that, in particular, the expectation of the first term cancels out.

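Proposition 3. Suppose $\{X_{t,N} : t = 1,\ldots,N\}$ is a tvARCH($p$) process
which satisfies Assumption 2 and $W$ is a kernel function of bounded
variation with $\int_{-1/2}^{1/2} W(x)\,dx = 1$ and $\int_{-1/2}^{1/2} x W(x)\,dx = 0$. Then if
$|u_0 - t_0/N| < 1/N$ we have

$$\mathbb{E}\big(\nabla B_{t_0,N}(a_{u_0})\big) = \frac{1}{2} b^2 w(2)\,\frac{\partial^2 \nabla \mathcal{L}(u, a_{u_0})}{\partial u^2}\Big|_{u=u_0} + O\Big(b^3 + \frac{1}{N}\Big) \quad\text{and}\quad \operatorname{var}\big(\nabla B_{t_0,N}(a_{u_0})\big) = O\Big(b^6 + \frac{1}{N}\Big),$$

where $w(2) = \int_{-1/2}^{1/2} W(x) x^2\,dx$.

A detailed proof can be found in Appendix A.4.

Propositions 2 and 3 and (32) give us the distributional properties of the
estimator $\hat{a}_{t_0,N}$, which we summarize in the theorem below.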

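Theorem 3. Suppose $\{X_{t,N} : t = 1,\ldots,N\}$ is a tvARCH($p$) process which
satisfies Assumption 2(i), (ii), (iii) and $W$ is a kernel function of bounded
variation with $\int_{-1/2}^{1/2} W(x)\,dx = 1$ and $\int_{-1/2}^{1/2} x W(x)\,dx = 0$. Then if
$|u_0 - t_0/N| < 1/N$, we have the following:

(i) If $b^3 \ll N^{-1}$, then

$$\sqrt{bN}\,(\hat{a}_{t_0,N} - a_{u_0}) \xrightarrow{D} \mathcal{N}\Big(0,\ w_2\,\frac{\operatorname{var}(Z_0^2)}{2}\,\Sigma(u_0)^{-1}\Big).$$

(ii) If in addition Assumption 2(iv), (v), (vi) holds and $b^{13} \ll N^{-1}$, then

$$\sqrt{bN}\,(\hat{a}_{t_0,N} - a_{u_0}) + \sqrt{bN}\,b^2 \mu(u_0) \xrightarrow{D} \mathcal{N}\Big(0,\ w_2\,\frac{\operatorname{var}(Z_0^2)}{2}\,\Sigma(u_0)^{-1}\Big), \tag{35}$$

where

$$\mu(u_0) = \frac{1}{2}\,w(2)\,\Sigma(u_0)^{-1}\,\frac{\partial^2 \nabla \mathcal{L}(u, a_{u_0})}{\partial u^2}\Big|_{u=u_0}. \tag{36}$$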
Remark 2. (i) We recall the structure of this result: The asymptotic
Gaussian distribution is the same as for the stationary approximation. In
addition, we have a bias term which comes from the deviation of the true
process from the stationary approximation on the segment. In particular,
this bias term is zero if the true process is stationary. A simple example is
given below. By estimating and minimizing the mean squared error (i.e., by
balancing the variance and the bias due to nonstationarity on the segment),
we may find an estimator for the optimal segment length.

(ii) If $\mathbb{E} Z_0^4 = 3$, as in the case of normally distributed $Z_t$, then

$$\sqrt{bN}\,(\hat{a}_{t_0,N} - a_{u_0}) + \sqrt{bN}\,b^2 \mu(u_0) \xrightarrow{D} \mathcal{N}\big(0,\ w_2\,\Sigma(u_0)^{-1}\big).$$

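(iii) It is clear from Propositions 2 and 3 that

$$\mathbb{E}\big\|\Sigma(u_0)^{-1} \nabla B_{t_0,N}(a_{u_0})\big\|_2^2 = b^4 \|\mu(u_0)\|_2^2 + O\Big(b^5 + \frac{1}{N}\Big)$$

and

$$\mathbb{E}\big\|\Sigma(u_0)^{-1} \nabla \mathcal{L}_N(u_0, a_{u_0})\big\|_2^2 = w_2\,\frac{\operatorname{var}(Z_0^2)}{2}\,\frac{1}{bN}\,\operatorname{trace}\big(\Sigma(u_0)^{-1}\big).$$

Therefore, if $b^{13} \ll N^{-1}$, using the above, we conjecture that

$$\mathbb{E}\|\hat{a}_{t_0,N} - a_{u_0}\|_2^2 = b^4 \|\mu(u_0)\|_2^2 + w_2\,\frac{\operatorname{var}(Z_0^2)}{2}\,\frac{1}{bN}\,\operatorname{trace}\big(\Sigma(u_0)^{-1}\big) + o\Big(b^4 + \frac{1}{bN}\Big). \tag{37}$$

However, this is very hard to prove. The $b$ which minimizes the conjectured
mean square error would be the theoretical optimal bandwidth (i.e., the
optimal segment length).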

(iv) We illustrate the above results with an example. We first consider
the tvARCH(0) process

$$X_{t,N} = \sigma_{t,N} Z_t, \qquad \sigma_{t,N}^2 = a_0\Big(\frac{t}{N}\Big),$$

which Drees and Starica [5] have also studied. In this case $X_t(u)^2 = a_0(u) Z_t^2$,
and under Assumption 2, we have

$$\frac{\partial^2 \nabla \mathcal{L}(u, a_{u_0})}{\partial u^2}\Big|_{u=u_0} = -\frac{1}{2}\,\frac{a_0''(u_0)}{a_0(u_0)^2} \quad\text{and}\quad \Sigma(u_0) = \frac{1}{2 a_0(u_0)^2},$$

that is,

$$\mu(u_0) = -\frac{1}{2}\,w(2)\,a_0''(u_0).$$

This example illustrates well how the bias is linked to the nonstationarity
of the process: if the process were stationary, the derivatives of $a_0(\cdot)$ would
be zero, causing the bias also to be zero. Conversely, sudden variations in
$a_0(\cdot)$ about the time point $u_0$ would be reflected in $a_0''(u_0)$ and manifest as
a large $\mu(u_0)$. Straightforward minimization of the first two summands in
(37) leads to the optimal bandwidth, which in this case (and for Gaussian
$Z_t$) takes the form

$$b_{\mathrm{opt}} = \Big(\frac{2 w_2}{w(2)^2}\Big)^{1/5} \Big[\frac{a_0(u_0)}{a_0''(u_0)}\Big]^{2/5} N^{-1/5},$$

leading to a large bandwidth if $a_0''(u_0)$ is small and vice versa. Thus, the
optimal choice of the bandwidth (of the segment length) depends on the
degree of stationarity of the process. For general tvARCH($p$) processes $\mu(u_0)$
is very hard to evaluate. Furthermore, it assumes a very complicated form.
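For the tvARCH(0) example the estimator has a closed form: setting the derivative of the weighted quasi-likelihood in $\alpha_0$ to zero gives a kernel-weighted average of the squared observations. The sketch below implements this average together with a bandwidth obtained by balancing a squared bias of the form $b^4 (w(2) a_0''/2)^2$ against a variance term $2 w_2 a_0^2/(bN)$ for Gaussian innovations. The function names, the kernel and its default constants (6/5 and 1/20) are assumptions of this illustration, and the constraint to the compact parameter set is ignored.

```python
def epanechnikov(x):
    """Illustrative kernel on [-1/2, 1/2] with integral 1."""
    return 6.0 * (0.25 - x * x) if abs(x) <= 0.5 else 0.0

def kernel_variance_estimate(x, t0, b, kernel=epanechnikov):
    """Closed-form minimizer for p = 0: weighted mean of X_{k,N}^2 around t0.

    Assumes b * len(x) is large enough that some weights are positive.
    """
    n = len(x)
    num = den = 0.0
    for k in range(1, n + 1):
        wk = kernel((t0 - k) / (b * n))
        num += wk * x[k - 1] ** 2
        den += wk
    return num / den

def conjectured_mse(b, a0, a0_second, n, w_sq=1.2, w_second=0.05):
    """Squared bias plus variance for p = 0 and Gaussian innovations (a heuristic)."""
    return b ** 4 * (0.5 * w_second * a0_second) ** 2 + 2.0 * w_sq * a0 ** 2 / (b * n)

def optimal_bandwidth(a0, a0_second, n, w_sq=1.2, w_second=0.05):
    """Minimizer of conjectured_mse in b; it scales like n^(-1/5)."""
    return ((2.0 * w_sq / w_second ** 2) ** 0.2
            * abs(a0 / a0_second) ** 0.4 * n ** -0.2)
```

Here `a0` and `a0_second` stand for the local level $a_0(u_0)$ and curvature $a_0''(u_0)$, which are unknown in practice and would have to be replaced by pilot estimates.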

(v) It is of interest to investigate whether the differences in the kernel- 
QML at each time point are because the true ARCH parameters are time- 
varying or are simply due to random variation in the estimation method. 
From a practical point of view, one could evaluate the sum of squared devi- 
ations between the kernel-QML estimator at each time point and the global 
QML estimator. We conjecture that the asymptotic distribution under the 
null hypothesis of stationarity is a chi-square. 

4. The derivative process. A key element to the proof of Theorem 3 is
the notion of the derivative of the process $X_t(u)^2$ with respect to $u$ and the
resulting Taylor expansion for the nonstationary process $X_{t,N}^2$ in terms of
stationary processes as given in Corollary 2 below. Since these "derivative
processes" are of general interest, we introduce them in this section for
general tvARCH($\infty$) processes $X_{t,N}$ as given in (2) and $X_t(u)$ given in (6).
We need the following stronger regularity conditions on the parameters.

Assumption 3. The third derivative of $\{a_j(\cdot)\}$ exists. Furthermore,

$$\sup_u \Big|\frac{\partial^i a_j(u)}{\partial u^i}\Big| \le \frac{C}{\ell(j)} \quad\text{for } i = 1, 2, 3 \text{ and } j = 0, 1, \ldots, \tag{38}$$

where $\{\ell(j)\}$ is defined as in Assumption 1 and $C$ is a finite constant
independent of $i$ and $j$.

Theorem 4. Suppose Assumptions 1 and 3 hold and let $\{X_t(u)\}$ be
defined as in (6). Then the derivatives $\{\partial X_t(u)^2/\partial u\}$, $\{\partial^2 X_t(u)^2/\partial u^2\}$ and
$\{\partial^3 X_t(u)^2/\partial u^3\}$ are almost surely well defined unique stationary stochastic
processes for each $u \in (0,1)$. Furthermore, $\partial X_t(u)^2/\partial u$ is almost surely the
unique solution of the stochastic differential equation

$$\frac{\partial X_t(u)^2}{\partial u} = \Big(a_0'(u) + \sum_{j=1}^{\infty} a_j'(u) X_{t-j}(u)^2 + \sum_{j=1}^{\infty} a_j(u)\,\frac{\partial X_{t-j}(u)^2}{\partial u}\Big) Z_t^2, \tag{39}$$

where $a_j'(u)$ denotes the derivative of $a_j(u)$ with respect to $u$.

Note that (39) is just the derivative of (6) with $\sigma_t(u)^2$ replaced by $X_t(u)^2/Z_t^2$.
An explicit formula for $\partial X_t(u)^2/\partial u$ is given in (51). Similar expressions also hold
for the second and third derivatives. For example, if all the derivatives of
$a_j(\cdot)$ were zero, the derivative process would also be zero [in this case $X_{t,N}^2$
would be stationary and $X_t(u)^2 = X_{t,N}^2$ for all $u$].
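For a tvARCH(1) model the recursion for the derivative process can be run alongside the recursion for the squared process, which gives a simple numerical check against a finite difference in $u$. The sketch below is an illustration only: the function name, the Gaussian innovations, the zero initial values before a burn-in period and the specific coefficient functions in the example are assumptions.

```python
import random

def stationary_with_derivative(u, a0, a1, da0, da1, n=200, burn=300, seed=2):
    """Return [(X_t(u)^2, dX_t(u)^2/du)] for t = 1..n in an ARCH(1) model at fixed u.

    a0, a1 are the coefficient functions, da0, da1 their derivatives.
    Both recursions start from zero 'burn' steps before t = 1 (an assumption).
    """
    rng = random.Random(seed)
    x2 = d = 0.0
    out = []
    for t in range(-burn + 1, n + 1):
        z2 = rng.gauss(0.0, 1.0) ** 2
        # Derivative recursion: uses X_{t-1}(u)^2 and the previous derivative.
        d = (da0(u) + da1(u) * x2 + a1(u) * d) * z2
        # Squared process recursion.
        x2 = (a0(u) + a1(u) * x2) * z2
        if t >= 1:
            out.append((x2, d))
    return out

# Example: compare with a central finite difference driven by the same innovations.
a0 = lambda u: 1.0 + u
a1 = lambda u: 0.2 + 0.3 * u
path = stationary_with_derivative(0.5, a0, a1, lambda u: 1.0, lambda u: 0.3)
```

Because the same seed produces the same innovations, evaluating the function at $u \pm h$ and differencing the first components approximates the second component at $u$.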

An important consequence of Theorem 4 is that it allows us to make a
Taylor expansion of $X_t(u)^2$ about $u_0$ (rigorously proved in Section 5), to
give

$$X_t(u)^2 = X_t(u_0)^2 + (u - u_0)\,\frac{\partial X_t(u)^2}{\partial u}\Big|_{u=u_0} + \frac{1}{2}(u - u_0)^2\,\frac{\partial^2 X_t(u)^2}{\partial u^2}\Big|_{u=u_0} + O_p\big((u - u_0)^3\big). \tag{40}$$

An interesting feature of the Taylor expansion in (40) is that it does not
depend on the existence of moments of $X_t(u)^2$, unlike other types of series
expansions. Instead the expansion depends on the smoothness of the
parameters $a_j(\cdot)$.

The approximation in (7), where $X_{t,N}^2 = X_t(t/N)^2 + O_p(1/N)$, and the Taylor
expansion in (40) lead to the corollary below.

Corollary 2. Suppose $\{X_{t,N}\}$ is a tvARCH process which satisfies
Assumptions 1 and 3 and let $X_t(u)$ be defined as in (6). Then for any
$u_0 \in (0, 1]$, we have

$$X_{t,N}^2 = X_t(u_0)^2 + \Big(\frac{t}{N} - u_0\Big) \frac{\partial X_t(u)^2}{\partial u}\Big|_{u=u_0} + \frac{1}{2}\Big(\frac{t}{N} - u_0\Big)^2 \frac{\partial^2 X_t(u)^2}{\partial u^2}\Big|_{u=u_0} + O_p\Big(\Big|\frac{t}{N} - u_0\Big|^3 + \frac{1}{N}\Big). \tag{41}$$

The nice feature of the result of Corollary 2 is that it gives a Taylor
expansion of the nonstationary process $X_{t,N}^2$ around $X_t(u_0)^2$ in terms of
stationary processes. This is particularly nice since it allows use of well-known
results for stationary processes (such as the ergodic theorem) in describing
properties of $X_{t,N}$. A similar result also holds for higher-order expansions
with higher-order derivatives. However, in this paper only a second-order
expansion is needed.

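As an example, we now use (41) to derive a tighter bound for the remainder
$R_N$ in (8). The effect is similar as in nonparametric regression: Due to
the anti-symmetry of the kernel weights, the expectation of the first term
falls out. By using (41), we have, for $|t_0/N - u_0| < 1/N$,

$$R_N = T_1 + T_2 + O_p\Big(\Big(\frac{M}{N}\Big)^3 + \frac{1}{N}\Big),$$

where

$$T_1 = \frac{1}{2M+1} \sum_{k=-M}^{M} \frac{k}{N}\,\frac{\partial X_{t_0+k}(u)^2}{\partial u}\Big|_{u=u_0} \quad\text{and}\quad T_2 = \frac{1}{2M+1} \sum_{k=-M}^{M} \frac{1}{2}\Big(\frac{k}{N}\Big)^2 \frac{\partial^2 X_{t_0+k}(u)^2}{\partial u^2}\Big|_{u=u_0}.$$

The expectation of $T_1$ is zero. Under the additional condition
$\mathbb{E}(Z_0^4)^{1/2} \sum_j Q/\ell(j) \le (1 - \nu)$, $\{\partial X_t(u)^2/\partial u\}$ is a short memory process in which case
$\operatorname{var}(T_1) = O(M/N^2)$ (see Lemmas A.7 and A.10). Thus, $T_2$ dominates $T_1$ in probability
and we have

$$R_N = \frac{1}{2M+1} \sum_{k=-M}^{M} \frac{1}{2}\Big(\frac{k}{N}\Big)^2 \frac{\partial^2 X_{t_0+k}(u)^2}{\partial u^2}\Big|_{u=u_0} + O_p\Big(\Big(\frac{M}{N}\Big)^3 + \frac{M^{1/2}}{N}\Big).$$

Note that this is a (stochastic) bias of the approximation in (8).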

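Theorem 4 and Corollary 2 can easily be generalized to include derivatives
of functions of tvARCH processes. By using the chain and product rules, we
have the generalization below, which we use to study the quasi-likelihood
defined in Section 3.

Corollary 3. Suppose Assumptions 1 and 3 hold, let $\{X_t(u)\}$ be as
defined in (6) and $f : \mathbb{R}^d \to \mathbb{R}$, where the first, second and third derivatives
of $f$ exist.

(i) Then

$$\frac{\partial f(X_{t_1}(u)^2, \ldots, X_{t_d}(u)^2)}{\partial u} = \sum_{i=1}^{d} \frac{\partial X_{t_i}(u)^2}{\partial u}\,\frac{\partial f}{\partial X_{t_i}(u)^2}, \tag{42}$$

$$\frac{\partial^2 f(X_{t_1}(u)^2, \ldots, X_{t_d}(u)^2)}{\partial u^2} = \sum_{i=1}^{d} \frac{\partial^2 X_{t_i}(u)^2}{\partial u^2}\,\frac{\partial f}{\partial X_{t_i}(u)^2} + \sum_{i,j=1}^{d} \frac{\partial X_{t_i}(u)^2}{\partial u}\,\frac{\partial X_{t_j}(u)^2}{\partial u}\,\frac{\partial^2 f}{\partial X_{t_i}(u)^2\,\partial X_{t_j}(u)^2}.$$

Furthermore, by using the product and chain rules, similar expressions can
be obtained for $\partial^3 f/\partial u^3$.

(ii) Suppose $f : \mathbb{R}^d \to \mathbb{R}$ is differentiable with uniformly bounded third
derivative. Then we have

$$f(X_{t+t_1,N}^2, \ldots, X_{t+t_d,N}^2) = f(\mathbf{X}_t(u_0)^2) + \Big(\frac{t}{N} - u_0\Big) \frac{\partial f(\mathbf{X}_t(u)^2)}{\partial u}\Big|_{u=u_0} + \frac{(t/N - u_0)^2}{2}\,\frac{\partial^2 f(\mathbf{X}_t(u)^2)}{\partial u^2}\Big|_{u=u_0} + O_p\Big(\Big|\frac{t}{N} - u_0\Big|^3 + \frac{1}{N}\Big), \tag{43}$$

where $\mathbf{X}_t(u)^2 := (X_{t+t_1}(u)^2, \ldots, X_{t+t_d}(u)^2)^T$.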

5. Volterra expansions of tvARCH processes. In this section we prove
the existence and uniqueness of the process $X_{t,N}^2$ and of the derivative
process from Section 4. This is done by means of Volterra expansions. The
methods used here can easily be generalized to include other nonstation- 
ary stochastic processes which have as their solution a Volterra expansion. 
Therefore, the results and methods in this section are of independent inter- 
est. A treatise of ordinary Volterra expansions can be found in [14]. 

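Giraitis, Kokoszka and Leipus [9] have shown that a unique solution of
$X_t(u)^2$, defined in (6), is almost surely the Volterra series given by

$$X_t(u)^2 = a_0(u) Z_t^2 + \sum_{k \ge 1} m_t(u, k), \tag{44}$$

where

$$m_t(u, k) = \sum_{j_1, \ldots, j_k \ge 1} a_0(u) \Big(\prod_{r=1}^{k} a_{j_r}(u)\Big) \prod_{r=0}^{k} Z^2_{t - \sum_{s=1}^{r} j_s} = \sum_{j_k < \cdots < j_0:\ j_0 = t} g_u(k, j_0, j_1, \ldots, j_k) \prod_{i=0}^{k} Z_{j_i}^2,$$

with

$$g_u(k, j_0, j_1, \ldots, j_k) = a_0(u) \prod_{i=1}^{k} a_{(j_{i-1} - j_i)}(u).$$

We now show a similar result is true for $X_{t,N}^2$. Let $a_j(u) = 0$ for $u < 0$ and
$j \ge 0$. A formal expansion of $X_{t,N}^2$, defined in (2), gives

$$X_{t,N}^2 = a_0\Big(\frac{t}{N}\Big) Z_t^2 + \sum_{k \ge 1} m_{t,N}(k), \tag{45}$$

where

$$m_{t,N}(k) = \sum_{j_1, \ldots, j_k \ge 1} a_0\Big(\frac{t - \sum_{s=1}^{k} j_s}{N}\Big) \Big(\prod_{r=1}^{k} a_{j_r}\Big(\frac{t - \sum_{s=1}^{r-1} j_s}{N}\Big)\Big) \prod_{r=0}^{k} Z^2_{t - \sum_{s=1}^{r} j_s} = \sum_{j_k < \cdots < j_0:\ j_0 = t} g_{t,N}(k, j_0, j_1, \ldots, j_k) \prod_{i=0}^{k} Z_{j_i}^2,$$

with

$$g_{t,N}(k, j_0, j_1, \ldots, j_k) = a_0\Big(\frac{j_k}{N}\Big) \prod_{i=1}^{k} a_{(j_{i-1} - j_i)}\Big(\frac{j_{i-1}}{N}\Big).$$

We stated in Proposition 1 that the tvARCH process has a unique solution.
We now prove this result by showing that (45) is the unique solution. The
proof in many respects is close to the proof of Theorem 2.1 in [9].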

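Proof of Proposition 1. We first show that (45) is well defined.
Since (45) is the sum of positive random variables and the coefficients are
also positive, we only need to show that the expectation of (45) is finite.
By using (3), (4) and the monotone convergence theorem, a bound for the
expectation of (45) is

$$\mathbb{E}(X_{t,N}^2) \le \sup_u a_0(u) + \sup_u a_0(u) \sum_{k=1}^{\infty} \sum_{j_k < \cdots < j_0:\ j_0 = t}\ \prod_{i=1}^{k} \frac{Q}{\ell(j_{i-1} - j_i)} \le \sup_u a_0(u) \Big[1 + \sum_{k=1}^{\infty} (1 - \nu)^k\Big] < \infty. \tag{46}$$

Furthermore, it is not difficult to see that $X_{t,N}^2$ is a well-defined solution of
(2).

To show uniqueness of $X_{t,N}^2$, we must show that any other solution is
equal to $X_{t,N}^2$ with probability one. Suppose $Y_{t,N}^2$ is a solution of (2). By
recursively applying relation (2) $r$ times to $Y_{t,N}^2$, we have

$$Y_{t,N}^2 = a_0\Big(\frac{t}{N}\Big) Z_t^2 + \sum_{k=1}^{r-1} m_{t,N}(k) + \sum_{j_r < \cdots < j_0:\ j_0 = t} \Big[\prod_{i=1}^{r} a_{(j_{i-1} - j_i)}\Big(\frac{j_{i-1}}{N}\Big)\Big] Y_{j_r,N}^2 \prod_{i=0}^{r-1} Z_{j_i}^2.$$

Thus, the difference between $Y_{t,N}^2$ and $X_{t,N}^2$ is

$$X_{t,N}^2 - Y_{t,N}^2 = A_r - B_r,$$

where

$$A_r = \sum_{k=r}^{\infty} m_{t,N}(k)$$

and

$$B_r = \sum_{j_r < \cdots < j_0:\ j_0 = t} \Big[\prod_{i=1}^{r} a_{(j_{i-1} - j_i)}\Big(\frac{j_{i-1}}{N}\Big)\Big] Y_{j_r,N}^2 \prod_{i=0}^{r-1} Z_{j_i}^2.$$

We now show, for any $\varepsilon > 0$, that $\sum_{r=1}^{\infty} P(|A_r - B_r| > \varepsilon) < \infty$. By using
(3) and (4), we have $\mathbb{E}(A_r) \le C(1 - \nu)^r$. Furthermore, since $Y_{t,N}^2$ is
causal, $Y_{j_r,N}^2$ and $\prod_{i=0}^{r-1} Z_{j_i}^2$ are independent (if $i < r$, then $j_i > j_r$).
Therefore, $\mathbb{E}(Y_{j_r,N}^2 \prod_{i=0}^{r-1} Z_{j_i}^2) = \mathbb{E}(Y_{j_r,N}^2)$ and we have

$$\mathbb{E}(B_r) \le \sup_{t,N} \mathbb{E}(Y_{t,N}^2)\,(1 - \nu)^r.$$

Now by using the Markov inequality, we have $P(A_r > \varepsilon) \le C_1 (1 - \nu)^r / \varepsilon$ and
$P(B_r > \varepsilon) \le C_1 (1 - \nu)^r / \varepsilon$ for some constant $C_1$. Therefore,
$P(|A_r - B_r| > \varepsilon) \le C_2 (1 - \nu)^r / \varepsilon$. Thus, $\sum_{r=1}^{\infty} P(|A_r - B_r| > \varepsilon) < \infty$ and by the
Borel-Cantelli lemma, the event $\{|A_r - B_r| > \varepsilon\}$ can occur only finitely often with
probability one. Since this is true for all $\varepsilon > 0$, we have $Y_{t,N}^2 = X_{t,N}^2$ almost surely and
therefore the required result. □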

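Remark 3. It is worth noting that $m_{t,N}(k)$ can be obtained by using
the recursion

$$m_{t,N}(k)=Z_t^2\sum_{j=1}^{\infty}a_j\Big(\frac{t}{N}\Big)m_{t-j,N}(k-1)\quad\text{for }k\ge2,$$

with the initial condition

$$m_{t,N}(1)=Z_t^2\sum_{j=1}^{\infty}a_j\Big(\frac{t}{N}\Big)a_0\Big(\frac{t-j}{N}\Big)Z_{t-j}^2.$$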
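
The recursion in Remark 3 can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' code: it uses a hypothetical tvARCH(2) process whose coefficient curves `a(j, u)` are invented for illustration, and it starts the process at time 0 with zero pre-sample squares, so that the Volterra series becomes a finite sum (terms of order $k>t$ vanish). The direct recursion (2) and the sum $a_0(t/N)Z_t^2+\sum_k m_{t,N}(k)$ should then agree up to rounding.

```python
import random

P = 2  # order of the illustrative tvARCH(p) process (an assumption)


def a(j, u):
    """Hypothetical coefficient curves a_j(u); a_j(u) = a_j(0) for u < 0."""
    u = max(u, 0.0)
    return (0.5 + 0.3 * u, 0.2 + 0.2 * u, 0.1)[j]


def direct_squares(z2, N):
    """X_{t,N}^2 from recursion (2), with zero squares before time 0."""
    x2 = []
    for t in range(len(z2)):
        s = a(0, t / N)
        for j in range(1, P + 1):
            if t - j >= 0:
                s += a(j, t / N) * x2[t - j]
        x2.append(z2[t] * s)
    return x2


def volterra_squares(z2, N):
    """a_0(t/N) Z_t^2 + sum_k m_{t,N}(k), with m_{t,N}(k) built recursively."""
    T = len(z2)
    total = [a(0, t / N) * z2[t] for t in range(T)]
    # initial condition: m_{t,N}(1)
    m = [z2[t] * sum(a(j, t / N) * a(0, (t - j) / N) * z2[t - j]
                     for j in range(1, P + 1) if t - j >= 0)
         for t in range(T)]
    for _ in range(T):  # orders k = 1, ..., T
        total = [x + y for x, y in zip(total, m)]
        # recursion: m_{t,N}(k) = Z_t^2 * sum_j a_j(t/N) * m_{t-j,N}(k-1)
        m = [z2[t] * sum(a(j, t / N) * m[t - j]
                         for j in range(1, P + 1) if t - j >= 0)
             for t in range(T)]
    return total
```

Feeding both functions the same list of squared innovations (for example squared standard normal draws) gives two sequences that coincide up to floating-point error.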



Our object now is to prove Theorem 1, that is, to bound the difference
between $X_{t,N}^2$ and $X_t(u_0)^2$. More precisely, we will prove under Assumption
1 that

$$|X_{t,N}^2-X_t(u_0)^2|\le K\Big(\Big|\frac{t}{N}-u_0\Big|+\frac{1}{N}\Big)U_t,\tag{47}$$

where $K$ is a finite constant and

$$U_t=Z_t^2+\sum_{k=1}^{\infty}\sum_{j_k<\cdots<j_0:\;j_0=t}\frac{k\,Q^{k-1}\,|j_0-j_k|}{\prod_{i=1}^{k}\ell(j_{i-1}-j_i)}\prod_{i=0}^{k}Z_{j_i}^2\tag{48}$$

is a stationary ergodic positive process with finite expectation.
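Proof of Theorem 1. To prove (47), we use the triangle inequality
to get

$$|X_{t,N}^2-X_t(u_0)^2|\le\Big|X_{t,N}^2-X_t\Big(\frac{t}{N}\Big)^2\Big|+\Big|X_t\Big(\frac{t}{N}\Big)^2-X_t(u_0)^2\Big|$$

and consider bounding $|X_{t,N}^2-X_t(t/N)^2|$ and $|X_t(t/N)^2-X_t(u_0)^2|$ separately.

By using (44) and (45), we have

$$\Big|X_{t,N}^2-X_t\Big(\frac{t}{N}\Big)^2\Big|\le\sum_{k\ge1}\sum_{j_k<\cdots<j_0:\;j_0=t}|g_{t,N}(k,j_0,j_1,\ldots,j_k)-g_{t/N}(k,j_0,j_1,\ldots,j_k)|\prod_{i=0}^{k}Z_{j_i}^2.\tag{49}$$

We notice $j_0/N=t/N$. By successively replacing $a_{(j_{i-1}-j_i)}(j_{i-1}/N)$ by
$a_{(j_{i-1}-j_i)}(t/N)$, by using (3), the Lipschitz continuity of the parameters in
(5) and that $(j_0-j_k)\ge(j_0-j_i)$ (for $i\le k$) and $(j_0-j_k)=\sum_{i=1}^{k}(j_{i-1}-j_i)$,
we have

$$|g_{t,N}(k,j_0,j_1,\ldots,j_k)-g_{t/N}(k,j_0,j_1,\ldots,j_k)|\le K\,\frac{Q^{k-1}}{N}\,\frac{k\,|j_0-j_k|}{\prod_{i=1}^{k}\ell(j_{i-1}-j_i)},\tag{50}$$

where $K$ is a finite constant. Therefore, by using (49) and (50), we have

$$\Big|X_{t,N}^2-X_t\Big(\frac{t}{N}\Big)^2\Big|\le K\,\frac{1}{N}\,U_t.$$

Now we bound $|X_t(t/N)^2-X_t(u_0)^2|$. By using (44), we have

$$\Big|X_t(u_0)^2-X_t\Big(\frac{t}{N}\Big)^2\Big|\le\Big|a_0(u_0)-a_0\Big(\frac{t}{N}\Big)\Big|Z_t^2+\sum_{k\ge1}\sum_{j_k<\cdots<j_0:\;j_0=t}|g_{u_0}(k,j_0,j_1,\ldots,j_k)-g_{t/N}(k,j_0,j_1,\ldots,j_k)|\prod_{i=0}^{k}Z_{j_i}^2.$$

By using similar methods to those given above, we have

$$\Big|X_t(u_0)^2-X_t\Big(\frac{t}{N}\Big)^2\Big|\le K\Big|\frac{t}{N}-u_0\Big|U_t.$$

Therefore, we have shown (47).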






We now show that $U_t$ is a well-defined stochastic process. Since $U_t$ is the
sum of positive random variables, we only need to show that $\mathbb{E}(U_t)<\infty$.
Taking the expectation of $U_t$, using (4) and the independence of $\{Z_t\}$, we
have

$$\mathbb{E}(U_t)=1+\sum_{k=1}^{\infty}\sum_{j_k<\cdots<j_0:\;j_0=t}\frac{k\,Q^{k-1}\,|j_0-j_k|}{\prod_{i=1}^{k}\ell(j_{i-1}-j_i)}
\le1+L\sum_{k=1}^{\infty}k^2(1-\nu)^{k-1}<\infty,$$

where $L=\sum_{j=1}^{\infty}j/\ell(j)$ [$L$ is finite by definition of $\ell(j)$]. Thus, $\{U_t\}$ is a
well-defined process with finite mean. By using Stout [17], Theorem 3.5.8,
we can show that $\{U_t\}$ is an ergodic process. Hence, we have the result. □
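
The content of the bound proved above, namely that $X_{t,N}^2$ and $X_t(u_0)^2$ differ by a term of order $|t/N-u_0|+1/N$, can be illustrated by simulation. The sketch below is only an illustration under assumptions of our own: a tvARCH(1) process with hypothetical linear coefficient curves `a0` and `a1`, Gaussian innovations, and both recursions driven by the same innovations and started from the same value. It estimates the mean absolute gap at $t_0=\lfloor u_0N\rfloor$, which should shrink roughly like $1/N$.

```python
import random


def a0(u):
    """Hypothetical intercept curve (an assumption, not from the paper)."""
    return 1.0 + 0.5 * u


def a1(u):
    """Hypothetical ARCH(1) coefficient curve (an assumption)."""
    return 0.2 + 0.2 * u


def mean_abs_gap(N, u0=0.5, reps=300, seed=0):
    """Monte Carlo estimate of E|X_{t0,N}^2 - X_{t0}(u0)^2| at t0 = floor(u0 * N)."""
    rng = random.Random(seed)
    t0 = int(u0 * N)
    total = 0.0
    for _ in range(reps):
        x2_tv = x2_st = a0(0.0)  # common starting value for both recursions
        for t in range(1, t0 + 1):
            z2 = rng.gauss(0.0, 1.0) ** 2  # shared squared innovation
            x2_tv = z2 * (a0(t / N) + a1(t / N) * x2_tv)  # time-varying recursion
            x2_st = z2 * (a0(u0) + a1(u0) * x2_st)        # stationary approximation
        total += abs(x2_tv - x2_st)
    return total / reps
```

Comparing, say, `mean_abs_gap(200)` with `mean_abs_gap(3200)` gives a rough feel for the $1/N$ rate; the exact numbers depend on the assumed curves and on the seed.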

We now prove Theorem 4 on the existence of the derivatives of $X_t(u)^2$
with respect to $u$. We will show that this is given by sums of the derivatives
of the $m_t(u,k)$ terms in (44), that is,

$$\frac{\partial X_t(u)^2}{\partial u}=a_0'(u)Z_t^2+a_0'(u)\sum_{k\ge1}\sum_{j_1,\ldots,j_k\ge1}\Big(\prod_{r=1}^{k}a_{j_r}(u)\Big)\prod_{r=0}^{k}Z^2_{t-\sum_{s=1}^{r}j_s}
+a_0(u)\sum_{k\ge1}\sum_{n=1}^{k}\sum_{j_1,\ldots,j_k\ge1}a_{j_n}'(u)\Big(\prod_{r=1,\,r\ne n}^{k}a_{j_r}(u)\Big)\prod_{r=0}^{k}Z^2_{t-\sum_{s=1}^{r}j_s}.\tag{51}$$

This leads to the Taylor expansions of $X_t(u)^2$ [as stated in (40)] and finally
to the Taylor-type representation of $X_{t,N}^2$ stated in Corollary 2. The
latter two results are proved below. Throughout the rest of the section
$X_{t,N}^2(\omega)$, $X_t(u,\omega)^2$, and so on, denote a specific realization of $X_{t,N}^2$, $X_t(u)^2$.

Proof of Theorem 4. From (44), we know that $X_t(u)^2$ has almost
surely a Volterra series expansion, given by (44), as its unique solution.
Therefore, there exists a subset $\mathcal{N}_1(u)$ of the event space where $\mathbb{P}(\mathcal{N}_1(u)^c)=1$ and

$$X_t(u,\omega)^2=a_0(u)Z_t(\omega)^2+\sum_{k\ge1}\sum_{j_1,\ldots,j_k\ge1}a_0(u)\Big(\prod_{r=1}^{k}a_{j_r}(u)\Big)\prod_{r=0}^{k}Z_{t-\sum_{s=1}^{r}j_s}(\omega)^2\qquad\forall\,\omega\in\mathcal{N}_1(u)^c.\tag{52}$$

Furthermore, since the random process $\{U_t\}$, defined in (48), is
well defined (see Theorem 1), there exists a set $\mathcal{N}_2$ with $\mathbb{P}(\mathcal{N}_2^c)=1$ and $U_t(\omega)$
finite, for all $\omega\in\mathcal{N}_2^c$. For $\omega\in\mathcal{N}_3(u)^c=\mathcal{N}_1(u)^c\cap\mathcal{N}_2^c$, we consider realizations
of the right-hand side of (51) and

$$V_t=\sup_u|a_0'(u)|\,Z_t^2+\sup_u|a_0'(u)|\sum_{k\ge1}\sum_{j_1,\ldots,j_k\ge1}\Big(\prod_{r=1}^{k}\sup_u a_{j_r}(u)\Big)\prod_{r=0}^{k}Z^2_{t-\sum_{s=1}^{r}j_s}
+\sup_u a_0(u)\sum_{k\ge1}\sum_{n=1}^{k}\sum_{j_1,\ldots,j_k\ge1}\sup_u|a_{j_n}'(u)|\Big(\prod_{r=1,\,r\ne n}^{k}\sup_u a_{j_r}(u)\Big)\prod_{r=0}^{k}Z^2_{t-\sum_{s=1}^{r}j_s}.$$

We will now use the following result: Suppose $f(x)=\sum_{j=1}^{\infty}g_j(x)$ for $x\in[0,1]$,
where $f$ is a deterministic function. It is well known that if $\sum_{j=1}^{\infty}g_j'(x)$ is
uniformly convergent [which is true if $\sum_{j=1}^{\infty}\sup_x|g_j'(x)|<\infty$], the derivatives
are finite and $\sum_{j=1}^{\infty}g_j(x)$ converges at least at one point, then
$f'(x)=\sum_{j=1}^{\infty}g_j'(x)$. We now use this result to show that the derivative of $X_t(u)^2$ is
well defined. Suppose $\omega\in\mathcal{N}_3(u)^c$. Then by using (52) and (48), we have

$$X_t(u,\omega)^2\le\max(1,Q)\,\sup_u a_0(u)\,U_t(\omega)<\infty,$$

where the summands in $U_t(\omega)$ are absolutely and uniformly summable. Furthermore,
under Assumption 3 and (4), we have, for all $\omega\in\mathcal{N}_3(u)^c$,

$$V_t(\omega)\le\Big(\sup_u|a_0'(u)|+C\sup_u|a_0(u)|\Big)U_t(\omega)<\infty,$$

where $C$ is a finite constant. Therefore, the derivative $\partial X_t(u,\omega)^2/\partial u$ exists and is
almost surely given by (51). By using [17], Theorem 3.5.8, it is clear that
$\{\partial X_t(u)^2/\partial u\}_t$ is an ergodic process.

To show that (51) is the unique solution of (39), we can use the same
method as given in the proof of Theorem 1. We omit the details here.

We can use the same method as described above to show that the higher-order
derivative processes $\{\partial^2X_t(u)^2/\partial u^2\}_t$ and $\{\partial^3X_t(u)^2/\partial u^3\}_t$ are uniquely
well-defined ergodic processes. Again, we omit the details. □
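
The termwise differentiation used above can be illustrated in the simplest case. The sketch below is our own illustration, assuming an ARCH(1) recursion $X_t(u)^2=Z_t^2\,(a_0(u)+a_1(u)X_{t-1}(u)^2)$ with hypothetical linear curves `a0`, `a1` and a start from zero at time 0. Differentiating this recursion formally in $u$ gives a second recursion for the derivative process, which can be compared with a finite-difference quotient of $X_t(u)^2$ computed from the same innovations.

```python
import random


# Hypothetical coefficient curves and their derivatives (assumptions).
def a0(u):
    return 1.0 + 0.5 * u


def a1(u):
    return 0.2 + 0.2 * u


def da0(u):
    return 0.5


def da1(u):
    return 0.2


def squares_and_derivative(z2, u):
    """Return (X_t(u)^2, dX_t(u)^2/du) at the last time point.

    Both recursions start from zero at time 0 (a finite-past truncation).
    """
    x2, dx2 = 0.0, 0.0
    for z in z2:
        # derivative of z * (a0(u) + a1(u) * x2) with respect to u (product rule);
        # it must be updated before x2 so that it uses the previous square
        dx2 = z * (da0(u) + da1(u) * x2 + a1(u) * dx2)
        x2 = z * (a0(u) + a1(u) * x2)
    return x2, dx2
```

With a small step `h`, the central difference `(X(u + h) - X(u - h)) / (2 * h)` computed from the first component should match the second component closely, which is the finite-sample analogue of the statement that the derivative process is given by the differentiated series.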



At this point it is easy to derive some moment conditions on $X_t(u)^2$ and
its derivatives.

Lemma 2. Suppose $\{X_{t,N}:t=1,\ldots,N\}$ is a tvARCH(∞) process which
satisfies Assumptions 1 and 3 and, in addition,

$$\big(\mathbb{E}(Z_0^{2r})\big)^{1/r}\sum_{j\ge1}\frac{Q}{\ell(j)}\le(1-\nu)$$

for $r\ge1$. Then $\mathbb{E}|U_t|^r<\infty$, $\mathbb{E}|\sup_u X_t(u)^2|^r<\infty$, $\mathbb{E}|\sup_u\partial X_t(u)^2/\partial u|^r<\infty$ and
$\mathbb{E}|\sup_u\partial^2X_t(u)^2/\partial u^2|^r<\infty$ uniformly in $t$.

Proof. The result follows by applying the Minkowski inequality to (48),
(44), (51) and the corresponding formula for the second derivative and using
arguments similar to those in (46). We omit the details. □

Proof of (40) and Corollary 2. We first prove (40). For
$\omega\in\mathcal{N}_3(u)^c\cap\mathcal{N}_3(u_0)^c$, with $\mathcal{N}_3$ as defined above, the Volterra series
expansions (44) give solutions of (6) for $X_t(u)^2$ and $X_t(u_0)^2$. The relation (40)
now follows from an ordinary Taylor expansion of Xt{u,uj)'^ about uq, not- 
ing that J^( ^ ^u^'^ \u=u^ ^ ^'■^ arbitrary random variable U . 

By using (40) and 



we obtain Corollary 2. □ 

6. Concluding remarks. We have studied the class of nonstationary
ARCH(∞) processes with time-varying coefficients. We have shown that,
about a given time point, the process can be approximated by a stationary
process. Moreover, this approximation has facilitated the Taylor expansion
of the tvARCH process in terms of stationary processes. It is worth mentioning
that the existence of the derivatives of the coefficients determines the
existence of the derivatives of the process and the subsequent Taylor expansion
(and not the existence of the moments). The definition of the derivative
process and the Taylor expansion is not restricted to tvARCH(∞) processes,
and with simple modifications can also be applied to other nonstationary
processes.

To estimate the time-varying parameters of a time-varying ARCH(p)
(p < ∞) process, we have used a weighted quasi-likelihood on a segment.
Investigation of the asymptotic properties of the estimator showed an extra
bias due to nonstationarity on the segment. This expression can be used
to find an adaptive choice of the segment length (by minimizing, e.g., the
mean squared error and estimating the second derivative). The relevance of
this model for (say) financial data needs further investigation. We conjecture
that, by using tvARCH models, the often discussed long range dependence
of the squared log returns can be reduced drastically and even disappear
completely (there has been some discussion that the long range dependence
of the squares is in truth only due to some nonstationarity in the data; see
[12]). Furthermore, we conjecture that, for example, the empirical kurtosis
of financial log returns is much smaller with a time-varying model than with
a classical ARCH model.

Typically for stationary ARCH(p) processes, the existence of $\mathbb{E}(Z_0^4)$ is
assumed in order to show asymptotic normality of the quasi-likelihood estimator.
A drawback of our approach is that the expression of the bias given in
(36) holds only under the assumption $[\mathbb{E}(Z_0^{12})]^{1/6}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, that is,
under the existence of the 12th moment. However, if we assume the weaker
condition $\mathbb{E}(Z_0^{4+\delta})<\infty$, then the segment quasi-likelihood estimator still
has asymptotically a normal distribution, but the explicit form of the bias
cannot be evaluated (see also Remark 1).

We mention that, unlike the case of stationary GARCH(p, q) models,
the time-varying GARCH model is not included in the tvARCH(∞) class.
The investigation of time-varying GARCH(p, q) models is a topic of future
research. However, unlike tvGARCH models, the squares of certain
tvARCH(∞) models have "near" long memory behavior (cf. [9, 11]). This
is one justification for studying tvARCH(∞) models.

Important issues not discussed in this paper are the practical aspects
of applying the model. In particular, identifiability requires investigation
since both conditional heteroscedasticity and time-varying parameters are
suitable to model volatility. Theoretically the model is identifiable and we
are convinced this also holds in practice for large data sets. However, it has
to be checked whether this leads to satisfactory results for moderate sample
sizes. Our idea is that the conditional heteroscedasticity models the short
term fluctuations, while the time-varying parameters model the longer term
changes. Of course this can be achieved by a sufficiently large choice of the
bandwidth.

APPENDIX 

In this appendix we establish the results required in the proofs of Section 3.

Many of the results related to the local quasi-likelihood defined at (11) depend
on the asymptotic limit of the weighted sum of nonstationary random
processes. The general method we use to deal with such sums is to substitute
an ergodic process for the nonstationary process, and to study the limit of
a weighted sum of an ergodic process. In Appendix A.1 we establish results
related to the weighted sums of ergodic processes. These results are used
in Appendix A.2, where we study the difference between the nonstationary
tvARCH process and the corresponding approximating stationary processes.
We then use this result to evaluate the limit of weighted sums of functions
of tvARCH(p) processes. In Appendix A.3 we investigate the mixing properties
of the likelihood process and in Appendix A.4 the bias of the segment
estimate from Section 3.

A.1. Convergence results for weighted sums of random variables. In this
section we prove ergodic type theorems for weighted sums of ergodic processes.
In the lemma below we show an almost sure convergence result and
in Lemma A.2 we prove convergence in probability for certain triangular
arrays.
Lemma A.1. Suppose $\{Y_t\}$ is an ergodic sequence with $\mathbb{E}|Y_t|<\infty$ and
$W:[-1/2,1/2]\to\mathbb{R}$ is a kernel function of bounded variation with
$\int_{-1/2}^{1/2}W(x)\,dx=1$. Then

$$\sum_{k=-M}^{M}\frac{1}{2M+1}W\Big(\frac{k}{2M+1}\Big)Y_k\overset{a.s.}{\longrightarrow}\mu\quad\text{as }M\to\infty,$$

where $\mu=\mathbb{E}(Y_0)$.

Proof. Since

$$\sum_{k=-M}^{M}\frac{1}{2M+1}W\Big(\frac{k}{2M+1}\Big)\to\int_{-1/2}^{1/2}W(x)\,dx=1,\tag{A.1}$$

we can assume without loss of generality that $\mu=0$. We split the sum into
negative and positive suffixed elements, which gives

$$\sum_{k=-M}^{M}\frac{1}{2M+1}W\Big(\frac{k}{2M+1}\Big)Y_k=\frac{1}{2M+1}\sum_{k=-M}^{0}W\Big(\frac{k}{2M+1}\Big)Y_k+\frac{1}{2M+1}\sum_{k=1}^{M}W\Big(\frac{k}{2M+1}\Big)Y_k=:N_M+P_M,\tag{A.2}$$

and consider first $P_M$. By using summation by parts, we have, with $S_k=\sum_{i=1}^{k}Y_i$,

$$P_M=\frac{1}{2M+1}\sum_{k=1}^{M-1}\Big[W\Big(\frac{k}{2M+1}\Big)-W\Big(\frac{k+1}{2M+1}\Big)\Big]S_k+\frac{1}{2M+1}W\Big(\frac{M}{2M+1}\Big)S_M.$$

Since $W$ is of bounded variation, this yields

with some constant $K$. Now the ergodic theorem implies $S_k(\omega)/k\to0$ for
almost all $\omega$. It is obvious for these $\omega$ that also $P_M(\omega)$ tends to zero. In the
same way we conclude that $N_M\to0$ a.s., which gives the result. □
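
A small numerical experiment can make the statement of Lemma A.1 concrete. The sketch below is purely illustrative and rests on our own assumptions: the ergodic sequence is taken to be a Gaussian AR(1) process shifted to have mean `mu`, and the kernel is a hypothetical quadratic kernel on $[-1/2,1/2]$ that integrates to one. The weighted average over $k=-M,\ldots,M$ should then be close to `mu` for large $M$.

```python
import random


def kernel(x):
    """Hypothetical kernel on [-1/2, 1/2]; it integrates to one."""
    return 1.5 * (1.0 - 4.0 * x * x) if abs(x) <= 0.5 else 0.0


def weighted_average(M, mu=2.0, phi=0.5, seed=0):
    """Kernel-weighted average of an AR(1) sequence with mean mu."""
    rng = random.Random(seed)
    e = 0.0
    for _ in range(200):  # burn-in so the AR(1) part is close to stationarity
        e = phi * e + rng.gauss(0.0, 1.0)
    total = 0.0
    for k in range(-M, M + 1):
        e = phi * e + rng.gauss(0.0, 1.0)
        # weight W(k / (2M + 1)) / (2M + 1) applied to Y_k = mu + e
        total += kernel(k / (2 * M + 1)) * (mu + e) / (2 * M + 1)
    return total
```

Increasing `M` tightens the average around `mu`; the lemma itself is of course much more general than this one example.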



For kernel estimates about arbitrary center points, the situation is more
difficult since we basically average over triangular arrays of observations. We
therefore prove in the following lemma only convergence in probability.

Lemma A.2. Suppose $\{Y_t\}$ is an ergodic sequence with $\mathbb{E}|Y_t|<\infty$ and
$W:[-1/2,1/2]\to\mathbb{R}$ is a kernel function of bounded variation with
$\int_{-1/2}^{1/2}W(x)\,dx=1$. Then

$$\mu_N(u_0):=\sum_{k=p+1}^{N}\frac{1}{bN}W\Big(\frac{u_0-k/N}{b}\Big)Y_k\overset{P}{\to}\mu\quad\text{for }u_0\in(0,1),\tag{A.3}$$

where $b\to0$, $bN\to\infty$ as $N\to\infty$, and $\mu=\mathbb{E}(Y_0)$.

Proof. Again we consider only the case $\mu=0$. Suppose $N\ge N_0$ with
$N_0$ such that $u_0-p/N_0>b_0/2$ and $u_0-1<-b_0/2$, $b_0=b(N_0)$ [i.e., the sum
in (A.3) is over the whole domain of $W$]. Let $k_0=k_0(N)$ be such that
$|u_0-k_0/N|<1/N$. Since $\{Y_k\}$ is stationary, $\mu_N(u_0)$ has the same distribution as

$$\sum_{k}\frac{1}{bN}W\Big(\frac{u_0-k/N}{b}\Big)Y_{k-k_0}=\sum_{k}\frac{1}{bN}W\Big(\frac{u_0-k_0/N-k/N}{b}\Big)Y_k.$$

Since $W$ is of bounded variation, this is equal to

$$\sum_{k}\frac{1}{bN}W\Big(\frac{-k}{bN}\Big)Y_k+R_N$$

with $|R_N|\le\frac{K}{bN}\sup_{-bN\le k\le bN}|Y_k|$. Lemma A.1 implies that the first term converges
to zero almost surely [the proof of Lemma A.1 remains the same with
$(2M+1)$ replaced by $bN$]. Since $|Y_k|\le|S_k|+|S_{k-1}|$, where $S_k=\sum_{i=1}^{k}Y_i$, the
second term also converges to zero almost surely (as in the proof of Lemma
A.1). Therefore,

$$\mathbb{P}(|\mu_N(u_0)|\ge\varepsilon)=\mathbb{P}(|\mu_N(u_0-k_0/N)|\ge\varepsilon)\to0,$$

which gives the result. □



A.2. Convergence of the local likelihood and its derivatives. In this
section we evaluate the limit of weighted sums of $\{\ell_{t,N}(\alpha)\}$, $\{\nabla\ell_{t,N}(\alpha)\}$,
$\{\nabla^2\ell_{t,N}(\alpha)\}$ and the corresponding stationary approximations. In particular,
we prove Lemma 1. Recall the formulas (14)-(16) and the corresponding
formulas for £tiuo,a). Let k= '^^^ and 

$$\Delta_{t,N}(u_0,U_t,\alpha)=\sum_{j=1}^{p}\alpha_j\Big(\Big|\frac{t-j}{N}-u_0\Big|+\frac{1}{N}\Big)U_{t-j},\tag{A.4}$$

with the ergodic process $U_t$ from (48). For a better understanding of the following
result, we note that we have, for $|u_0-t/N|<1/N$, $\mathbb{E}(\Delta_{t,N}(u_0,U_t,\alpha))=O(N^{-1})$
and therefore,

$$\Delta_{t,N}(u_0,U_t,\alpha)=O_p(N^{-1}),$$

uniformly in $u_0$ and $\alpha$. The same holds for $Z_t^2\Delta_{t,N}$ since $Z_t^2$ and $\Delta_{t,N}$ are
independent with $\mathbb{E}(Z_t^2)=1$.

In the following lemmas we derive upper bounds for the expressions occurring
in (14), (15) and (16) and for the difference between these expressions
and the corresponding expressions in $\ell_t(u_0,\alpha)$ and its derivatives. Assumption
2(iii) immediately yields

$$\frac{|\nabla w_{t,N}(\alpha)_i|}{w_{t,N}(\alpha)}\le\frac{1}{\rho_1},\qquad\frac{|\nabla w_t(u_0,\alpha)_i|}{w_t(u_0,\alpha)}\le\frac{1}{\rho_1}\tag{A.5}$$

uniformly in $t$, $N$, $u_0$ and $\alpha$ ($i=1,\ldots,p+1$).

Lemma A.3. Suppose $\{X_{t,N}:t=1,\ldots,N\}$ is a tvARCH(p) process which
satisfies Assumption 2(i), (iii). Then

$$\frac{X_{t,N}^2}{w_{t,N}(\alpha)}\le\kappa Z_t^2\quad\text{and}\quad\frac{X_t(u)^2}{w_t(u,\alpha)}\le\kappa Z_t^2.\tag{A.6}$$

Proof. We only prove (A.6) for the tvARCH case; the proof for the
stationary case is similar. Since $X_{t,N}^2=Z_t^2\sigma_{t,N}^2$, we have

$$\frac{X_{t,N}^2}{w_{t,N}(\alpha)}=Z_t^2\,\frac{a_0(t/N)+\sum_{j=1}^{p}a_j(t/N)X_{t-j,N}^2}{\alpha_0+\sum_{j=1}^{p}\alpha_jX_{t-j,N}^2}\le\kappa Z_t^2.$$

The last line is true because $\sum_{j=1}^{p}a_j(u)\le1$ and $\alpha_j\ge\rho_1$ for $j=0,\ldots,p$.
□

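Lemma A.4. Under the assumptions of Lemma A.3, we have

$$\frac{X_{t,N}^2}{w_{t,N}(\alpha)}=\frac{X_t(u_0)^2}{w_t(u_0,\alpha)}+R_{1,N}(u_0,t),\tag{A.7}$$

where $|R_{1,N}(u_0,t)|\le\frac{1}{\rho_1}\big(|t/N-u_0|+1/N\big)U_t+\frac{\kappa}{\rho_1}Z_t^2\Delta_{t,N}(u_0,U_t,\alpha)$,

$$\frac{\nabla w_{t,N}(\alpha)_i}{w_{t,N}(\alpha)}=\frac{\nabla w_t(u_0,\alpha)_i}{w_t(u_0,\alpha)}+R_{2,N}(u_0,t)\qquad(i=1,\ldots,p+1),\tag{A.8}$$

where $|R_{2,N}(u_0,t)|\le\frac{2}{\rho_1^2}\Delta_{t,N}(u_0,U_t,\alpha)$,
and

$$\log w_{t,N}(\alpha)=\log w_t(u_0,\alpha)+R_{3,N}(u_0,t),\tag{A.9}$$

where $|R_{3,N}(u_0,t)|\le\frac{1}{\rho_1}\Delta_{t,N}(u_0,U_t,\alpha)$.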

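Proof. We first prove (A.7). We have

$$|R_{1,N}(u_0,t)|\le\Big|\frac{X_{t,N}^2}{w_{t,N}(\alpha)}-\frac{X_t(u_0)^2}{w_{t,N}(\alpha)}\Big|+\Big|\frac{X_t(u_0)^2}{w_{t,N}(\alpha)}-\frac{X_t(u_0)^2}{w_t(u_0,\alpha)}\Big|
\le\frac{1}{\rho_1}|X_{t,N}^2-X_t(u_0)^2|+\frac{X_t(u_0)^2}{\rho_1w_t(u_0,\alpha)}|w_{t,N}(\alpha)-w_t(u_0,\alpha)|.\tag{A.10}$$

From the definitions of $w_{t,N}(\alpha)$ and $w_t(u_0,\alpha)$ and by using (7), we have

$$|w_{t,N}(\alpha)-w_t(u_0,\alpha)|\le\sum_{j=1}^{p}\alpha_j\Big(\Big|\frac{t-j}{N}-u_0\Big|+\frac{1}{N}\Big)U_{t-j}.\tag{A.11}$$

Together with (7) and (A.6), this leads to (A.7). Since $\nabla w_{t,N}(\alpha)_i=X_{t-i+1,N}^2$
for $i=2,\ldots,p+1$, the proof of (A.8) is almost the same, so we omit the
details. The case $i=1$ also follows in the same way.

We now prove (A.9). By differentiating $\log(w_{t,N}(\alpha))$ with respect to
$X_{t-j,N}^2$ and using the mean value theorem, we have

$$R_{3,N}(u_0,t)=\frac{1}{\alpha_0+\sum_{i=1}^{p}\alpha_iY_i}\sum_{j=1}^{p}\alpha_j\big(X_{t-j,N}^2-X_{t-j}(u_0)^2\big),\tag{A.12}$$

where $\{Y_i:i=1,\ldots,p\}$ are positive random variables [since both $X_{t-i,N}^2$ and
$X_{t-i}(u_0)^2$ are positive and $Y_i$ lies in between]. Therefore, by using (7), we
have

$$|R_{3,N}(u_0,t)|\le\frac{1}{\rho_1}\sum_{j=1}^{p}\alpha_j\Big(\Big|\frac{t-j}{N}-u_0\Big|+\frac{1}{N}\Big)U_{t-j}\le\frac{1}{\rho_1}\Delta_{t,N}(u_0,U_t,\alpha),$$

which is the required result. □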



Corollary A.l. Under the assumptions of Lemma A.3, we have, for 
n e N, 



Ilr=l'^Wt^N{a)i, Ilr=l'^MuO,a) 



(A.13) 



+ RA,N{uO,t), 



2n 



where | i?4^7v (^^o , I < -^Aj,Ar(no, a) 
Pi 
and
(A.14) 



where |i?5,Ar(iio, t)| < 

Pi 

where < ir <p for r = 1, . . . , n. 



1 \ 2Kn 
+ ^ r* + —Z?^t,N{uo, Ut, a), 
N J pY 



Proof. We can prove (A. 13) by successively replacing Vwt^N{oi.)ir/f^t,N{c() 
by \7wt{uQ,a)i^/'Wt{uQ,a.) for r = l,...,n. Then by using (A. 8) and the 
bound > pi for all a S ri, we have the result. We can prove (A.14) by 
using a similar method as above together with (A. 6) and (A. 7). We omit 
the details here. □ 

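Lemma A.5. Suppose $\{X_{t,N}\}$ is a tvARCH(p) process which satisfies
Assumption 2(i), (iii) and $W$ is a kernel function of bounded variation with
$\int_{-1/2}^{1/2}W(x)\,dx=1$. Then we have

$$\sum_{k=p+1}^{N}\frac{1}{bN}W\Big(\frac{t_0-k}{bN}\Big)\frac{X_k(u_0)^2\prod_{r=1}^{n}\nabla w_k(u_0,\alpha)_{i_r}}{w_k(u_0,\alpha)^{n+1}}\overset{P}{\to}\mathbb{E}\Big(\frac{X_0(u_0)^2\prod_{r=1}^{n}\nabla w_0(u_0,\alpha)_{i_r}}{w_0(u_0,\alpha)^{n+1}}\Big),\tag{A.15}$$

$$\sum_{k=p+1}^{N}\frac{1}{bN}W\Big(\frac{t_0-k}{bN}\Big)\frac{\prod_{r=1}^{n}\nabla w_k(u_0,\alpha)_{i_r}}{w_k(u_0,\alpha)^{n}}\overset{P}{\to}\mathbb{E}\Big(\frac{\prod_{r=1}^{n}\nabla w_0(u_0,\alpha)_{i_r}}{w_0(u_0,\alpha)^{n}}\Big)\tag{A.16}$$

and

$$\sum_{k=p+1}^{N}\frac{1}{bN}W\Big(\frac{t_0-k}{bN}\Big)\log(w_k(u_0,\alpha))\overset{P}{\to}\mathbb{E}(\log(w_0(u_0,\alpha))).\tag{A.17}$$

Proof. Since $|\nabla w_t(u_0,\alpha)_i|/w_t(u_0,\alpha)\le1/\rho_1$, we have by using (A.6) that

$$\frac{X_t(u_0)^2\prod_{r=1}^{n}\nabla w_t(u_0,\alpha)_{i_r}}{w_t(u_0,\alpha)^{n+1}}\le\frac{\kappa Z_t^2}{\rho_1^{n}}.$$

By using [17], Theorem 3.5.8, the process
$\{X_t(u_0)^2\prod_{r=1}^{n}\nabla w_t(u_0,\alpha)_{i_r}/w_t(u_0,\alpha)^{n+1}\}_t$ is ergodic
and by using the bound above has finite mean. By applying Lemma
A.2, we have verified (A.15).

Since $\log\rho_1\le\log w_t(u_0,\alpha)\le\log\rho_2+(1/\rho_1)\sum_{j=1}^{p}\alpha_jX_{t-j}(u_0)^2$,
$\log w_t(u_0,\alpha)$ has a finite mean. (A.16) and (A.17) follow similarly. □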
Lemma A.6. Under the assumptions of Lemma A.5 and $|t_0/N-u_0|<1/N$
with $u_0\in(0,1)$, we have, for all $n\in\mathbb{N}$,



(A.18) 



^ 1 fto-k 

sup > TT^W 



bN 



Op{b), 



^ I fto-k 



bN 



(A.19) 



and 



l\r=l'^Wk{uo,a)i 



Opib) 



N 



(A.20) sup Yl ^w(\-^)\log{wkM<^))-log{wk{uo,c^))\=Op{b). 
^en,^^,bN \ bN 



Proof. Let 



^fcK)'n"=i'Vu;fc(no,a) 



i()fc(no,Q;)" 

We note first that if q G then ai < max(l, P2), where Oi is the ith element 
of the (p + l)-dimensional vector a. By using (A. 14) and |-^ 

bN 



p\ when k lies in the support of W{H-j^), we have the bound 



k—j I ^ I k—j 



N 



N 



k=p+l 



bN 



bN 



N 



where Vfc = | [/^ + Z| J2 ^k-j | 

and $C$ is a finite constant. Since $U_{t-j}$ (by Theorem 1) and $Z_t^2$ have finite
mean and are independent when $j\ge1$, $\{V_t\}$ has a finite mean. Therefore,
by using (A.3), we have that $L_N\overset{P}{\to}\mathbb{E}(V_0)$ and $|R_N|=O_p(b)$. Thus, we have
proved (A.18).

By using (A.13) and (A.9), we can obtain (A.19) and (A.20) similarly.
□

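Proof of Lemma 1. We first show (22). To prove uniform convergence,
it is sufficient to show both pointwise convergence and equicontinuity in
probability of $\mathcal{L}_N(u_0,\alpha)$ (since $\Omega$ is compact). By using (A.15) and (A.17),
for every $\alpha\in\Omega$, we have

$$\mathcal{L}_N(u_0,\alpha)=\frac{1}{2}\sum_{k=p+1}^{N}\frac{1}{bN}W\Big(\frac{t_0-k}{bN}\Big)\Big(\log w_k(u_0,\alpha)+\frac{X_k(u_0)^2}{w_k(u_0,\alpha)}\Big)\overset{P}{\to}\mathcal{L}(u_0,\alpha),$$

where $b\to0$, $bN\to\infty$ as $N\to\infty$. We now show equicontinuity in probability
of $\mathcal{L}_N(u_0,\alpha)$. By the mean value theorem, for every $\alpha_1,\alpha_2\in\Omega$ there exists
an $\bar\alpha\in\Omega$ such that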

|^Af('»o,Qi) - ^N{uo,a2)\'^ 
\\ai — CX.2\\2 

< ||V£jv(no,a)||i 

/ Vwk{uo,a) Xfc(uo)^Vwfc(Mo,Q:) \ ^ 
V Wk{uo,a) Wk{uQ,a.Y ) 2 

By using (A. 5), we have 

2pi V P\ J 

Therefore, we have that $\mathcal{L}_N(u_0,\cdot)$ is equicontinuous in probability. Now
by pointwise convergence of $\mathcal{L}_N(u_0,\alpha)$, equicontinuity of $\mathcal{L}_N(u_0,\alpha)$ and
the compactness of $\Omega$, we have uniform convergence of the kernel quasi-likelihood.

By using (A.18) and (A.20), it is straightforward to verify (23). (24) and
(25) can be proved by using the same method as above. □
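
To make the object of Lemma 1 more tangible, the kernel quasi-likelihood can be evaluated on simulated data. The following is a hedged sketch rather than the authors' implementation: it assumes a tvARCH(1) process with hypothetical coefficient curves, a hypothetical quadratic kernel, and the local criterion $\frac12(\log w_k+X_k^2/w_k)$ with $w_k=\alpha_0+\alpha_1X_{k-1}^2$; all function names are our own.

```python
import math
import random


def kernel(x):
    """Hypothetical kernel on [-1/2, 1/2] that integrates to one."""
    return 1.5 * (1.0 - 4.0 * x * x) if abs(x) <= 0.5 else 0.0


def simulate_tvarch1(N, seed=0):
    """Squares of a tvARCH(1) process with illustrative coefficient curves."""
    rng = random.Random(seed)
    x2, prev = [], 1.0
    for t in range(1, N + 1):
        u = t / N
        z2 = rng.gauss(0.0, 1.0) ** 2
        prev = z2 * ((1.0 + 0.5 * u) + (0.2 + 0.2 * u) * prev)
        x2.append(prev)
    return x2


def local_quasi_likelihood(x2, u0, b, alpha):
    """Kernel-weighted average of 0.5 * (log w_k + X_k^2 / w_k) around u0."""
    N = len(x2)
    t0 = int(u0 * N)
    total = 0.0
    for k in range(2, N + 1):  # k = p + 1, ..., N with p = 1
        weight = kernel((t0 - k) / (b * N)) / (b * N)
        if weight == 0.0:
            continue
        w = alpha[0] + alpha[1] * x2[k - 2]  # w_k = alpha_0 + alpha_1 * X_{k-1}^2
        total += weight * 0.5 * (math.log(w) + x2[k - 1] / w)
    return total
```

With these assumed curves the local parameter at $u_0=0.5$ is $(1.25, 0.3)$, so evaluating the criterion there and at a clearly different point such as $(4.0, 0.9)$ should typically give a smaller value at the former; a segment estimator would minimize this criterion over the parameter space.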



- 2,^,bN \ bN ) 






A.3. Mixing properties of the likelihood process. We now investigate
the mixing properties of $X_t(u)^2$ and later $\{\nabla\ell_t(u,\mathbf{a}_{u_0})\}$ and their derivatives
with respect to $u$. Our object is to show that the sums of the absolute values
of the covariances of the process $\{\nabla\ell_t(u,\mathbf{a}_{u_0})\}$ and its derivatives are finite
under suitable regularity conditions. To achieve this, we use a well-known
theorem of Gallant and White [8] which states that $\sum_k|\mathrm{cov}(Y_t,Y_{t+k})|<\infty$
if $\{Y_t\}$ is an $L_2$-Near Epoch Dependent ($L_2$-NED) process of size $-\infty$ on
the mixing process $\{X_t\}$ of size $-\infty$ (see Lemma A.10 below). To use this
result, we need an appropriate mixing process $\{X_t\}$. To this end, we use a
result of Basrak, Davis and Mikosch [1], who have shown that a stationary
ARCH(p) process is $\alpha$-mixing with a geometric rate (thus having size $-\infty$)
if Assumption 2(iii), (v) is satisfied. Therefore, the stationary ARCH(p)
process $\{X_t(u)^2\}$, under Assumption 2(v), (vi), is $\alpha$-mixing with size $-\infty$.
We will use this fact in the following lemmas, where we will show that both
the processes $\{\partial X_t(u)^2/\partial u\}_t$ and $\{\partial^2X_t(u)^2/\partial u^2\}_t$ are $L_2$-NED on $\{X_t(u)^2\}_t$.

Let $\mathcal{F}_{t-m}^{t+m}=\sigma(X_{t-m}(u)^2,\ldots,X_{t+m}(u)^2)$ and $\mathbb{E}_{t-m}^{t+m}(Y)=\mathbb{E}(Y\mid\mathcal{F}_{t-m}^{t+m})$.

Lemma A.7. Suppose $\{X_{t,N}:t=1,\ldots,N\}$ is a tvARCH(p) process which
satisfies Assumption 2(i), (iii)-(v).

(i) If $(\mathbb{E}(Z_0^4))^{1/2}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, then $\{\partial X_{t-i}(u)^2/\partial u\}_t$ and
$\{\partial^2X_{t-i}(u)^2/\partial u^2\}_t$ are $L_2$-NED of size $-\infty$ on $\{X_t(u)^2\}_t$ ($i=0,\ldots,p$).

(ii) If $(\mathbb{E}(Z_0^8))^{1/4}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, then
$\{\partial X_{t-i}(u)^2/\partial u\cdot\partial X_{t-j}(u)^2/\partial u\}_t$ is $L_2$-NED
of size $-\infty$ on $\{X_t(u)^2\}_t$ ($i,j=0,\ldots,p$).

Furthermore, $\{X_t(u)^2\}$ is $\alpha$-mixing of size $-\infty$.

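Proof. That $\{X_t(u)^2\}$ is $\alpha$-mixing of size $-\infty$ follows from [1]. We first
prove (i) for $i=0$, that is,

$$\Big\{\mathbb{E}\Big(\frac{\partial X_t(u)^2}{\partial u}-\mathbb{E}_{t-m}^{t+m}\Big(\frac{\partial X_t(u)^2}{\partial u}\Big)\Big)^2\Big\}^{1/2}\le a_m,\tag{A.21}$$

where $a_m$ has a geometric rate of decay [thus the derivative process is $L_2$-NED
of size $-\infty$ on $\{X_t(u)^2\}$]. Since under the quadratic norm
$\mathbb{E}_{t-m}^{t+m}(\partial X_t(u)^2/\partial u)$ is the best projection of $\partial X_t(u)^2/\partial u$ onto the
sigma algebra $\mathcal{F}_{t-m}^{t+m}$, then

$$\Big\|\frac{\partial X_t(u)^2}{\partial u}-\mathbb{E}_{t-m}^{t+m}\Big(\frac{\partial X_t(u)^2}{\partial u}\Big)\Big\|_2\le\Big\|\frac{\partial X_t(u)^2}{\partial u}-S\Big\|_2\tag{A.22}$$

for all $S\in\mathcal{F}_{t-m}^{t+m}$. We assume from now on that $m>2p$. Inspired by (51), we
now choose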


m—pk+l 



k=l r=lt[m)<jk<-<jo=t 

jk + l=jk 



fc + 1 

n 



i=0 



where $t(m)=t-m+p$ [the index $j_{k+1}$ is introduced to avoid special treatment
of $a_0(u)$]. It is clear that $Z_t^2,\ldots,Z_{t-m+p}^2\in\mathcal{F}_{t-m}^{t+m}$; therefore $S_t^m\in\mathcal{F}_{t-m}^{t+m}$.
It is straightforward to show that the following difference can be partitioned
as below:



(A.23) 
where 

and 

We have 
(A.24) 



du 



oo fc+1 / \ fc 

E E E n n^: 

fc=m+l— p r=l ifc<---<.7o=t \i=l^i^r I «=0 



m-pfc+1 



L Jfc<---<Jo 
jk+l=jk 



fc+1 

[ 



^2 



fc+1 



EE E n n 4- 



fc=l r=l jk<---<jo=t 

jk<t{m),jk+i=jk 



i=Q 



du 



ST 



< IIA 



m\\2 



m.\\2- 



Our object is to show that the mean square error of (A.24) has a geometric
rate of decay, which, by the inequality in (A.22), implies (A.21). We
now bound $\|A_m\|_2$ and $\|B_m\|_2$. Under Assumption 2(iv), there exists a $C^*$
such that $\sup_u|a_j'(u)|\le C^*Q/\ell(j)$ for $j=1,\ldots,p$. Therefore, by using the
Minkowski inequality, (3) and (4), we have



imib < sup {C*ao{ui) + |ao(u2)|) 

Ul,U2 

oo fc k 



Q 



(A.25) 



E E E n f(^. ^ _ „• N 

k=m+l-pr=lj^<--<jo=ti=l 



4N(fc+l)/2 



< sup(C*ao(7Xi) + |a'o(n2)|) H'^ - 



k=m+l—p 



m—p 
where $K$ is a finite constant. Now we bound $\|B_m\|_2$. Since $a_j(u)=0$ for
$j>p$, all $j_{i-1}-j_i$ have to be $\le p$ in order to have a nonzero contribution in
$B_m$. Since for $j_k<t-m+p$,

$$\sum_{i=1}^{k}(j_{i-1}-j_i)=j_0-j_k>m-p,$$

this can only be true for $k>(m-p)/p$. Therefore,



m—p k+l 



k+l 



k=[{m-p)/p] r=l jk<---<jo=t 

jk+l=jk 

which gives 

WBmh < sup (C*ao(ni) + 100(^2)1) 



i=0 



Ul,U2 



m—p 



Q 



(A.26) 



X E E E ^^(n, 



4^{k+l)/2 



k=[{m-p)/p] r=ljk< - <jo= 

00 

<sup{C*aoiui) + \a'o{u2)\) E - '^)'' 

"^'"^ fc=[(m-p)/p] 
= i^(l-l/)(™-P)/P, 

where $K$ is a finite constant. Therefore, by using (A.25) and (A.26), we have



(A.27) 



du 



■E 



t—m 





< 




2 



SI' 



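$$\le2K\big((1-\nu)^{1/p}\big)^{m-p},$$

thus giving a geometric rate for (A.21) and the required result.

For $\{\partial X_{t-i}(u)^2/\partial u\}$, the result follows in the same way by using $S_{t-i}^{m-i}$
instead of $S_t^m$. For $\{\partial X_{t-i}(u)^2/\partial u\cdot\partial X_{t-j}(u)^2/\partial u\}$, we use the product
$S_{t-i}^{m-i}S_{t-j}^{m-j}$ instead of $S_t^m$. Since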



du 



du 



Qin~i am— J 



< 



dXt_i{u) 



du 



+ IPt-i Il4 



dXt~jiu)' 



du 



dXt-^iu) 



qrn-j 



du 



om—i 



and $\|A_{m-i}\|_4$ and $\|B_{m-i}\|_4$ also have a geometric rate of decay, we also
obtain $L_2$-NED of size $-\infty$ in this case. The $L_2$-NED property for
$\{\partial^2X_t(u)^2/\partial u^2\}$ is proved in a similar way. We omit the details. □
In Lemma A.9 we generalize the above result to derivatives of $\{\nabla\ell_t(u,\mathbf{a}_{u_0})\}$
with respect to $u$, which we use in Corollary A.2. We will also need the lemma
below, which gives conditions under which moments of $\partial^s\nabla\ell_t(u,\mathbf{a}_{u_0})/\partial u^s$ exist.

Lemma A.8. Suppose $\{X_{t,N}:t=1,\ldots,N\}$ is a tvARCH(p) process which
satisfies Assumption 2(i), (iii)-(v) and, in addition,

$$\big(\mathbb{E}(Z_0^{2rs})\big)^{1/(rs)}\sum_{j\ge1}\frac{Q}{\ell(j)}\le(1-\nu)\tag{A.28}$$

for $r\ge1$ and $s\in\mathbb{N}$. Then

d''Vit{u,auo)i 



Esup 



< oo for i = 1, . . . ,p + 1, 



and the expectation is uniformly bounded in u. 

Proof. We first consider V^j(u,a„„). It is worth noting 

(A.29) V£t(u,a„J^ = - - 
and 



(A.30) 



_ dit(u,a{uo)) 
dai-i{u) 



for i = 2, . . . ,p + 1. We first prove the result for the case s = 1. By using 
Corollary 3(i), we have 



(A.31) 



du 



du 



,=0 - dXt^.inf 
By using (A.29) and (A.30), if a^g G Q, we have 



(A.32) 



dXt{u) 



< K and 



d\/£t{u, a. 



■«o/« 



dXt-jiuY^ 



<K(1 + Zf 



for j = l,...,p, 

where K is a finite constant. By using (A.31), (A.32) and the independence 



of and ^^gj''"^ , we obtain 



dVit{u,ai, 



du 



< 



dXtiu) 



du 



dXt-,{uf 



du 



thus, 
^2



■UoJl 



du 




du 



dXt-jiuf 



du 



(A.28) and Lemma 2 now imply the result.

To prove the similar result for the higher order derivatives ($s>1$), we use
the same method as above. But in this case we require stronger conditions
on the moments of $X_t(u)^2$ [see (A.28)]. The proof is straightforward and we
omit the details here. □

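We now use the result above to show that $\{\partial\nabla\ell_t(u,\mathbf{a}_{u_0})/\partial u\}$ is $L_2$-NED on
$\{X_t(u)^2\}$.

Lemma A.9. Suppose $\{X_{t,N}\}$ is a tvARCH(p) process which satisfies
Assumption 2(i), (iii)-(v).

(i) If $(\mathbb{E}(Z_0^4))^{1/2}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, then the process
$\{\partial\nabla\ell_t(u,\mathbf{a}_{u_0})_i/\partial u\}_t$
is $L_2$-NED of size $-\infty$ on $\{X_t(u)^2\}$.

(ii) If $(\mathbb{E}(Z_0^8))^{1/4}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, then the process
$\{\partial^2\nabla\ell_t(u,\mathbf{a}_{u_0})_i/\partial u^2\}_t$
is $L_2$-NED of size $-\infty$ on $\{X_t(u)^2\}$.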

Proof. We first prove (i). Let $m>2p$. By using (A.32), we have



dViti 



du 

p 

<E 

j=0 



< K 



du 



dVit 

dXt-j{uY I du 



■n 



t+m 



du 



dXtiu? 



du 



t+m 



( dXt{u) 
\ du 



+ ||(1 + Z2)||2^ 



du 



t—m 



du 
Lemma A.7 now implies that $\{\partial\nabla\ell_t(u,\mathbf{a}_{u_0})/\partial u\}_t$ is $L_2$-NED of size $-\infty$ on
$\{X_t(u)^2\}$.

The proof for the second derivative process is similar, but requires the
stronger moment condition given in (ii). We omit the details of the proof.
□

We now state a theorem of Gallant and White [8] which we use in Corollary A.2.

Lemma A.10. Suppose the stationary process $\{Y_t\}$ is $L_2$-NED of size
$-\infty$ on $\{X_t\}$, which is an $\alpha$-mixing process of size $-\infty$, and we have
$\mathbb{E}|Y_0|^{2+\delta}<\infty$ for some $\delta>0$. Then

$$\sum_{s=0}^{\infty}|\mathrm{cov}(Y_t,Y_{t+s})|<\infty.$$

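Corollary A.2. Suppose $\{X_{t,N}:t=1,\ldots,N\}$ is a tvARCH process
which satisfies Assumption 2(i), (ii), (iv) and (v) and, in addition, for some
$\delta>0$:

(i) If $(\mathbb{E}(Z_0^{2(2+\delta)}))^{1/(2+\delta)}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, then we have

$$\sum_{s=0}^{\infty}\Big|\mathrm{cov}\Big(\frac{\partial\nabla\ell_t(u,\mathbf{a}_{u_0})}{\partial u},\frac{\partial\nabla\ell_{t+s}(u,\mathbf{a}_{u_0})}{\partial u}\Big)\Big|<\infty.\tag{A.34}$$

(ii) If $(\mathbb{E}(Z_0^{2(4+\delta)}))^{1/(4+\delta)}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$, then we have

$$\sum_{s=0}^{\infty}\Big|\mathrm{cov}\Big(\frac{\partial^2\nabla\ell_t(u,\mathbf{a}_{u_0})}{\partial u^2},\frac{\partial^2\nabla\ell_{t+s}(u,\mathbf{a}_{u_0})}{\partial u^2}\Big)\Big|<\infty.\tag{A.35}$$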



Proof. The condition $(\mathbb{E}(Z_0^{2(2+\delta)}))^{1/(2+\delta)}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$ implies
$(\mathbb{E}(Z_0^4))^{1/2}\sum_{j\ge1}Q/\ell(j)\le(1-\nu)$ by Hölder's inequality. Now by Lemma A.9(i),
we have under this assumption that $\{\partial\nabla\ell_t(u,\mathbf{a}_{u_0})/\partial u\}_t$ is $L_2$-NED of size $-\infty$
on the $\alpha$-mixing process $\{X_t(u_0)^2\}_t$. Therefore, all the conditions in Lemma
A.10 are satisfied and (i) follows. The proof of (ii) is the same, but the
stronger condition given in (ii) is required. □

A.4. The bias of the segment quasi-likelihood estimate. 

Proof of Proposition 3. Substituting (34) into (33) gives 
Cn



to — k\ f k 



bN V bN 



5u 



We now consider the expectation of $A_N(u_0)$. We have



E(i7vK))=E 



O 



du 

dVik{u,a.uo) 



U=Uq / ^ 



du 



u=uo/ J — 1/2 



1/2 1 



-w 



tp-k 
bN J\N 

— ]xdx + O 



no 
1 

N 



Furthermore, we have 
1 



vai{AN{uo)) 



{bNf 



to - ki 
bN 



X W 



X cov 



to - k2 \ f k 
bN 



k2 



< 



62 



(6iV)2 



E^ 

k 



tp-k 
bN 



du 



N 



u=uo 

to — k — s 
bN 



du 



cov 



du 



(6iV)2 



E 



cov 



^0 - 

/ dVlk{u, a, 



9V4+s(ti,a. 



■UO/ 



du 



U=Uq 



dVlk+s{u,SLuo) 
du 
where $\|W\|_\infty=\sup_xW(x)$. By using Corollary A.2, we have that the sum
of the absolute values of the covariances is finite. This gives



xE 

s 



cov 



du 



Mo 



du 



Mo 



o 



bN 



O 



In the same way we obtain 



El B 



N 



and 



vari B 



2 f^-^d'^^ C{u,aL^ 



¥w{2)- 



+ 



O 



bN 



O 



We now evaluate a bound for $\mathbb{E}(C_N(u_0)^2)$, which will help us to bound
both $\mathbb{E}(C_N(u_0))$ and $\mathrm{var}(C_N(u_0))$. By using Lemma A.8, we have



IE(C^(no)2) 



[bN)'- 



to - h 



w 



X E 



bN 



bN 



b^ ( 
<— -r^E ; 



/93V4(^/,a„J^2 



(9^V42(ti,a 



«=C/fe 



xEE^ 

fcl fc2 



to - ki 



bN 



W 



to - k2 



bN 



<6°||VF||^E sup 



0{b% 



leading to the result. □ 